Journal of Applied Psychology 


Edited by Donald G. Paterson, University of Minnesota 


Consulting Editors 


George K. Bennett, Psychological Corporation 
Harold E. Burtt, Ohio State University 
Allen L. Edwards, University of Washington 
Clifford E. Jurgensen, Minneapolis Gas Co. 
Irving Lorge, T. C. Columbia University 
Quinn McNemar, Stanford University 
Alexander Mintz, City College of New York 


James P. Porter, Claverack, New York 
Harold F. Rothe, Fairbanks, Morse and Co., 
Beloit, Wis. 
Julian B. Rotter, Ohio State University 
Edward K. Strong, Jr., Stanford University 
Donald E. Super, T. C. Columbia University 
Morris S. Viteles, University of Pennsylvania 
Alfred C. Welch, Knox-Reeves, Minneapolis 





Table of Contents 


The Measurement of Leadership Attitudes in Industry: E. A. Fleishman

Productivity and Attitude Toward Supervisor: C. H. Lawshe and B. F. Nagle 

A Simplified Procedure for the Measurement of Employee Attitudes: M. E. Baehr 

The Motivation Factor in Testing Supervisors: E. E. Jennings 

The Minnesota Engineering Analogies Test: M. D. Dunnette 

The Humm-Wadsworth Temperament Scale as an Indicator of the ‘Problem’ Employee: A. R. Gilliland and S. E. Newman


The Prediction of Success and Failure in Elementary Foreign Language Courses: H. C. Peters

Predicting Grades in Advanced College Mathematics: J. R. Kinzer and L. G. Kinzer

A New Method for Determining Readability of Standardized Tests: F. W. Forbes and W. C. Cottle

A Modified Administration Procedure for the O'Connor Finger Dexterity Test: E. A. Fleishman

A Comparison of the Revised Allport-Vernon Scale of Values (1951) and the Kuder Preference Record (Personal): I. Iscoe and O. Lucier


Administering Form BB of the Kuder Preference Record, Half Length: A. A. Canfield

Attitudes Toward Public Low-Rent Housing, Before and After Construction: K. E. Clark and C. E. Swanson


Group Performance in a Manual Dexterity Task: A. L. Comrey 

Response Time as an Indicator of Color Deficiency: S. Ross and J. L. Fletcher

The Effect of Set on Performance in a “Trouble Shooting” Situation: N. A. Fattu and E. V. Mech

An Evaluation of Two Experimental Charts as Navigational Aids to Jet Pilots: J. E. Murray


The Relationship between Scotopic Visual Acuity and Acuity at Photopic and Mesopic Brightness Levels: J. E. Uhlaner, D. A. Gordon, I. A. Woods, and J. Zeidner


The Influence of Increased Positive g on Reaching Movements: A. A. Canfield, A. L. Comrey and R. C. Wilson

Applied Psychology in Action: 
Editor’s Note 
Job Supervision of Young Workers


Personnel Psychology in a Steel Company


New Books, Monographs, and Pamphlets 





American Psychological Association 


Vol. 37, No. 3 


June, 1953 





Journal of Applied Psychology 


Published Bi-monthly by the American Psychological Association, Inc. 
Prince and Lemon Sts., Lancaster, Pa. 


Annual subscription, $6.00; single copies, $1.25 


Subscriptions and business communications should be sent to 


American Psychological Association 
1333 Sixteenth Street N.W. 
Washington 6, D. C. 


Articles for publication and books for review should be sent to the Editor 


Professor Donald G. Paterson, Department of Psychology 
University of Minnesota, Minneapolis 14, Minnesota 





This Journal gives prompt consideration to 
manuscripts reporting original investigations in 
any field of applied psychology except clinical 
and consulting psychology. A descriptive or 
theoretical article is occasionally accepted if it 
deals in a distinctive manner with a problem of 
applied psychology. The policy is, however, to 
favor papers dealing with quantitative investi- 
gations of direct value to psychologists working 
in the following fields: Vocational diagnosis and 
occupational guidance; educational diagnosis, 
prediction and guidance at the secondary school 
level and higher; personnel selection, training, 
placement, transfer and promotion in business, 
industry and government service including the 
armed forces; supervisory training in business, 
industry and government; bio-mechanics or de- 
sign of machines to fit the human operator; il- 
lumination, ventilation and fatigue in industry; 
job analysis, description, classification and eval- 
uation; measurement of morale of executives, 
supervisors, or employees; surveys of opinion on 
social or political issues, such as those conducted 
by The Psychological Corporation; psychological
problems in market research and in advertising. 


Articles may be under 500 words. The maximum
is 12,000 words, the average in the
neighborhood of 4,000 words. To reduce lag of
publication, adherence to the rule of “brevity
consistent with clarity” is encouraged.


A lapse of six to twelve months occurs between 
acceptance of an article and its publication, the 
lag varying with the rate at which manuscripts 
are submitted. If, however, an author is pre- 
pared to defray the costs of printing the neces- 
sary extra pages, he may arrange for earlier 
publication without thereby postponing the ap- 
pearance of manuscripts by other contributors. 
This enables the management to provide space in 
addition to the scheduled 64 pages per issue. 
“Early publication” is thus a direct contribution 
to the subscribers. By cutting down lag in pub- 
lication, it also benefits those authors whose 
articles are published in regular turn. 


Tables, footnotes and references as well as 
text of manuscripts should be typed double-spaced 
throughout. Authors should adhere to the con- 
ventions described in the “Publication Manual 
of the American Psychological Association,” 
Psychol. Bull., 1952, 49, No. 4, Part 2. A copy 
of the Manual will be loaned to any prospective 
contributor who does not find it in his library. 


Entered as second-class matter, August 19, 1943, at the post office at Lancaster, Pa., under the act of March 3, 1879 


Acceptance for mailing at the special rate of postage provided for in paragraph (d-2), Section 34.40, 
P. L. & R. of 1948, authorized October 10, 1947


Copyright, 1953, by The American Psychological Association, Inc. 













The Measurement of Leadership Attitudes in Industry 


Edwin A. Fleishman * 


USAF Air Training Command, Human Resources Research Center ** 


Recent years have seen an intensified con- 
cern in industry for the problems of leader- 
ship and human relations. Evidence of this 
can be seen in the increasing number of lead- 
ership training programs which have been in- 
stituted in various industries. However, those 
who train supervisors must still rely on a lim- 
ited number of general assumptions largely 
unsupported by either sound theory or em- 
pirical data. Part of this difficulty arises 
from the fact that effectual leadership de- 
pends to a great extent on the situation. Ad- 
ditional difficulties stem from the lack of 
adequate criteria of group effectiveness. A 
pressing need is the development of depend- 
able research instruments which can be uti- 
lized to describe adequately the various com- 
plex socio-psychological aspects of a wide 
variety of leader-group situations.¹ If these
were available, they might later be related to 
criteria of group effectiveness in many specific 
situations in which leaders function. The 
present study was a further attempt to de- 
velop a number of such instruments which 
would have application in industry. 


* This research was carried out while the writer 
was at the Personnel Research Board, Ohio State 
University, as part of a larger project on leadership 
in industry, with the cooperation of the International 
Harvester Company. 

** Lackland Air Force Base, San Antonio, Texas. 
The opinions or conclusions contained in this report 
are those of the author. They are not to be con- 
strued as reflecting the views or indorsement of the 
Department of the Air Force. 

1 The Personnel Research Board has made the de- 
velopment of such instruments a major part of their 
leadership research program. See Stogdill and Shartle 
(10), Shartle (9), Seeman (8), Halpin and Winer 
(4), Hemphill (5), Hemphill and Westie (6), and 
Fleishman (1, 2). Another approach to the meas- 
urement of leadership attitudes has been made by 
Nelson (7). 


In a previous paper (2) the writer has de- 
scribed a questionnaire found useful for the 
description of supervisory behavior. The 
present paper describes questionnaires which 
were developed for the measurement of lead- 
ership attitudes. 


Construction of the Questionnaire 


A preliminary 110-item Leadership Opin- 
ion Questionnaire was administered to 100 
foremen in a pilot study at the company’s 
Central School. These foremen represented 
17 different company plants. The foreman 
indicated for each item how frequently he 
thought he should do what each item de- 
scribed. He responded by marking one of 
five frequency alternatives which followed 
each item (e.g., always, often, occasionally, 
seldom, never). He was told that there were 
no right or wrong answers in the question- 
naire since “everyone’s work group is different 
and what is the best way to lead one group 
may not be the best way for another.” 

The items in this questionnaire were gen- 
erally parallel to those in the pre-test form of 
the Supervisory Behavior Description previ- 
ously described (2). However, in this latter 
questionnaire the items were worded in terms 
of “what does your own supervisor actually 
do” while in the present questionnaire items 
were worded in terms of “what should you 
do.” The questionnaire was scored along two 
major and two minor dimensions.² Of the


2 These dimensions were originally isolated in a 
factor analysis of the items of a Leadership Behavior 
Description questionnaire administered to 300 Air 
Force crew members who described their airplane 
commander (4). Later analysis of the items, based 
on this industrial population, supported only the 
two major factors “Consideration” and “Initiating 
Structure” (2). 
two major dimensions, one was called “Consideration,” which contained items reflecting
the extent to which the supervisor is consid- 
erate of the feelings of those under him. It 
comes closest to representing the “human re- 
lations” approach toward group members. 
The second major dimension was called “Ini- 
tiating Structure,” and contained items re- 
flecting the extent to which the supervisor 
facilitates or defines group interactions to- 
ward goal attainment. He does this by plan- 
ning, communicating, scheduling, criticizing, 
initiating new ideas, etc. The two minor fac- 
tors were called, “Production Emphasis,” and 
“Social Sensitivity.” Response distributions 
were obtained for the alternatives to each of 
the items in the questionnaire. 

The corrected split-half reliability estimates
for the two major keys “Consideration”
and “Initiating Structure” were .69 and .73,
respectively, and for the two minor keys
“Production Emphasis” and “Social Sensitivity”
the reliabilities were .36 and .33, respectively.
In the light of the low reliabilities
of the latter two keys and in view of the
fact that a modified factor analysis of the
items in the parallel Supervisory Behavior
Description indicated that only the two major
dimensions were meaningful in this industrial
population, the dimensions of “Production
Emphasis” and “Social Sensitivity” were
omitted from the revised form.³

The criteria for selecting items for the re- 
vised form included: (1) the response dis- 
tributions of the items in the Leadership 
Opinion Questionnaire; and (2) the factor 
loadings, based on this industrial population, 
of parallel items on the Supervisory Behavior 
Description. Items were favored whose par- 
allel item had a high loading on the factor in 
which it was keyed and insignificant loading 
on the other factor. It was hoped that this 
procedure would yield two scales tapping rela- 
tively independent leadership attitude dimen- 
sions. 

Twenty items were selected in this manner 
for the “Consideration” key and 20 items 


3 Actually, in this analysis it appeared that in the
industrial sample, “Initiating Structure” and “Pro- 
duction Emphasis” were reflections of the same un- 
derlying dimension, as were “Consideration” and “So- 
cial Sensitivity.” 
were selected for the “Initiating Structure”
key. Examples of items in the revised “Con- 
sideration” key were: 


Help people in the work group with their 
personal problems. 

Back up what people under you do. 

Speak in a manner not to be questioned 
(response weights reversed). 


Examples of items in the revised “Initiat- 
ing Structure” key were: 


Emphasize meeting of deadlines. 

Assign people in the work group to particu- 
lar tasks. 

Meet with the work group at regularly 
scheduled times. 


Administration of the Revised Questionnaires 


Various forms of the revised questionnaire 
were administered in one of the company’s 
plants. 

A total of 122 foremen filled out the fol- 
lowing forms with the indicated response 
“sets”: 

(1) A Leadership Opinion Questionnaire: 
How he thinks he should lead his own work 
group. 

(2) A questionnaire entitled, “What Your 
Boss Expects of You”: A description of how 
the foreman feels his boss wants him to lead 
the work group. 

A total of 394 workers filled out a question- 
naire entitled, “How You Expect an Ideal 
Foreman to Act.” This is a description of 
worker expectations regarding leadership be- 
havior. 

Also, 60 supervisors above the rank of fore- 
man filled out the following questionnaires: 

(1) A Leadership Opinion Questionnaire: 
How the boss thinks he should lead the fore- 
men under him. 

(2) A questionnaire entitled, “What You 
Expect of Your Foremen”: A description by
the boss of how he wants his foremen to lead 
their workers. 

All these forms are variations of the Lead- 
ership Opinion Questionnaire revised on the 
basis of the pilot study. All contained the 
same 40 items reworded slightly in certain 
forms to apply to the appropriate situational 
context. 







Table 1

Means, Range, Standard Deviations, Reliabilities, and Intercorrelations of
Dimension Scores in Each Revised Instrument

                                                           Mean              Reliability  Inter-
Instrument                           Dimension¹            Score  Range²     S.D.  Estimate³    correlation

Filled out by foremen (N = 122)
  Leadership Opinion Questionnaire   Consideration         ...    36 to 74   ...   ...          ...
                                     Initiating Structure  ...    34 to 69   ...   ...
  “What Your Boss Expects of You”    Consideration         ...    21 to 68   ...   ...          ...
                                     Initiating Structure  ...    31 to 68   ...   ...

Filled out by workers (N = 394)
  “How You Expect an Ideal           Consideration         ...    41 to 70   ...   ...          ...
  Foreman to Act”                    Initiating Structure  44.2   26 to 58   ...   ...

Filled out by foreman’s boss (N = 60)
  “What You Expect of Your Foremen”  Consideration         53.0   38 to 67   ...   ...          ...
                                     Initiating Structure  54.0   37 to 68   ...   ...
  Leadership Opinion Questionnaire   Consideration         58.0   40 to 75   ...   ...          ...
                                     Initiating Structure  52.4   31 to 69   ...   ...

[... = entry illegible in this copy. Two further entries, 78 and 82, survive
without a recoverable column; 78 stands in the row whose range is 37 to 68.]

1 Each dimension key in each questionnaire contained 20 items.

2 Since alternatives to each item were weighted zero to four, the highest possible score is 80 in each
questionnaire for each dimension.

3 Split-half correlations corrected to full length of each dimension by the Spearman-Brown formula.


Results 


Adequacy of the Questionnaires. On all 
the instruments, the five alternative responses 
to each item were assigned weights from zero 
to four. Whether the high frequency alterna- 
tive (e.g., always) was weighted zero or four 
depended on the item’s orientation with re- 
spect to the total dimension continuum. To- 
tal dimension scores were derived by adding 
the weights corresponding to the alternatives 
marked for the items in each dimension. Ta- 
ble 1 presents a summary of the means, range 
of scores, standard deviations, reliabilities and 
dimension intercorrelations for each instru- 
ment. 
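The scoring rule just described can be sketched in a few lines of Python. This is an illustrative sketch only, not the author's procedure: the function names, the ordering of the alternatives, and the use of a boolean flag to mark items whose weights are reversed are assumptions made for the example. Only the zero-to-four weights, the 20-item keys, and the resulting maximum of 80 come from the text.

```python
# Frequency alternatives, ordered from most to least frequent (as listed in the text).
ALTERNATIVES = ["always", "often", "occasionally", "seldom", "never"]


def item_weight(response, reversed_item=False):
    """Return the 0-4 weight for one marked alternative.

    For an ordinary item "always" earns 4 and "never" earns 0;
    for a reversed item the weights run the other way.
    """
    rank = ALTERNATIVES.index(response)  # 0 for "always" ... 4 for "never"
    return rank if reversed_item else 4 - rank


def dimension_score(responses, reversed_flags):
    """Sum the item weights for all items keyed to one dimension."""
    if len(responses) != len(reversed_flags):
        raise ValueError("one reversal flag is needed per response")
    return sum(item_weight(r, f) for r, f in zip(responses, reversed_flags))


# Hypothetical respondent who marks "always" on all 20 items of a key
# in which only the first item has reversed weights.
answers = ["always"] * 20
flags = [True] + [False] * 19
print(dimension_score(answers, flags))
```

With 20 items per key and a top weight of four per item, no respondent can exceed 80 on either dimension, which is the ceiling noted under Table 1.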

The striking feature of Table 1 is the inde- 
pendence of the two dimensions in each of the 
forms used. This is especially true when the 
forms are filled out by workers and by fore- 
men. The correlations cluster about zero. 
Even in the case of the foremen’s supervisors, 
these correlations are low relative to those 
usually obtained with such instruments and 
do not reach the 1% level of significance. 


The important thing in interpreting the relia- 
bility coefficients is their magnitude relative 
to the dimension intercorrelations. Appar- 
ently, these instruments tap reliably two in- 
dependent dimensions of leadership attitudes. 
This is especially interesting since a criterion 
for item inclusion was the loading of a par- 
allel item in a Supervisory Behavior Descrip- 
tion questionnaire. An ideal but time con- 
suming procedure would have been to repeat 
the factor analysis on the attitude form but 
the independence of dimensions seems to have 
been accomplished by the present procedure. 
At least it appears that the usual “halo” ef- 
fect, which often inflates the intercorrelation 
among keys in instruments of this type, has 
been efficiently partialed out in the revised 
form. The distributions of scores obtained 
from most of the questionnaires are roughly 
normal in shape. 

The implication of these findings seems to 
be that the dimensions of “Consideration” 
and “Initiating Structure” are as meaningful 
and as independent in the attitudinal domain 
Table 2

Comparison of the Leadership Attitude Scores of
Workers, Foremen, General Foremen,
and Superintendents

                          Level in the
Dimension                 Organization                Mean    S.D.

“Consideration”           Superintendents (N = 13)    52.6    8.1
                          General Foremen (N = 30)    ...     ...
                          Foremen (N = 122)           ...     ...
                          Workers (N = 394)           ...     ...

“Initiating Structure”    Superintendents (N = 13)    ...     ...
                          General Foremen (N = 30)    ...     ...
                          Foremen (N = 122)           ...     ...
                          Workers (N = 394)           ...     ...

[... = entry illegible in this copy.]

1 Indicates this mean differs significantly (beyond the
.01 level of confidence) from the mean of the foremen
group.


of leadership as in the behavioral realm. It 
thus appears that supervisors may be high in 
the amount of consideration they feel should 
be shown their subordinates, but at the same 
time may be either low or high in the amount 
of planning, criticizing, pushing for produc- 
tion, and general “structuring” behavior that 
they feel they should engage in. There is also 
the indication that workers who want a great 
deal of “consideration” in their foremen do 
not necessarily want less “structuring” or
more “structuring” of their work activities 
from him. 

Attitudes at Different Levels. The ques- 
tionnaires entitled “What You Expect of 
Your Foremen” (filled out by supervisors),
Leadership Opinion Questionnaire (filled out 
by foremen), and “How You Expect an Ideal 
Foreman to Act” (filled out by workers) all 
measure the respondents’ values about how 
the work groups should be led. The mean 
scores on each instrument provide a compari- 
son of these leadership attitudes at four levels 
in the plant. Table 2 presents this compari- 
son for four clear-cut organizational levels. 


The comparison shows that the attitudes of
the foreman group are much more like the at- 
titudes of superiors than they are like the 
attitudes of the workers. Differences be- 
tween the mean scores of the foremen and 
their bosses are not statistically significant. 
Differences between the scores of the foremen 
and those of the workers are highly signifi- 
cant. This is true of scores on both leader- 
ship dimensions. The workers prefer more 
“consideration” and less “structure.”⁴ It
also appears that the higher people were in 
the plant hierarchy, the less “consideration” 
they felt the workers should get. Moreover, 
the higher the level, the more “structuring” 
the people felt should be initiated with the 
work group. However, some of these differ- 
ences were not large or significant although 
consistent. The tendency was for the fore- 
men’s attitudes to fall somewhere between 
what the workers expect and what their su- 
pervisors expect. 

Table 2 also indicates the relatively small 
standard deviations of the scores made by 
workers on both dimensions of the form “How 
You Expect an Ideal Foreman to Act.” In
each dimension the differences between the 
sigmas of worker attitude scores and that for 
supervisor attitude scores are statistically sig- 
nificant (P< .01). It appears that the 
workers are more homogeneous with respect 
to their leadership expectations than are the 
supervisory groups with respect to how they 
feel groups should be led. However, these 
scores present little evidence revealing the 
existence of an “ideal leadership” stereotype 
among workers since there was still a con- 
siderable range of scores on both expected 
“consideration” and expected “structure” at
the worker level (see Table 1). 

It was also possible to compare the leader- 
ship attitudes of supervisors above the rank 
of foreman toward foremen and workers. 
This comparison is between scores made by 
these supervisors on the Leadership Opinion 
Questionnaire and the form “What You Expect
of Your Foremen.” This comparison is
presented in Table 3. 


4 It is interesting to note that although the workers
are generally on a piece rate basis, they prefer
less “structuring,” which consists in large part of 
foremen activities pushing for production. 
Table 3

Comparison of the Leadership Attitudes of Foremen’s Supervisors Toward
Workers and Foremen (N = 60 Supervisors)

                                 Leadership Attitudes
                           Toward Workers    Toward Foremen
Dimension                  Mean    S.D.      Mean    S.D.      P

“Consideration”            53.0    ...       58.0    6.4       < .01
“Initiating Structure”     54.0    ...       52.4    7.6       > .05

[... = entry illegible in this copy.]


It can be seen that supervisors above the
rank of foreman scored significantly higher
on their “Consideration” attitudes toward the
foremen than in their “Consideration” attitudes
toward workers. However, the difference
in the amount of “Structuring” they felt
should be initiated toward each group is not
statistically significant. Moreover, bosses
who scored high in their “Consideration” attitudes
toward foremen tended also to score
higher on these attitudes toward workers
(r = .58). This was also true for “Initiating
Structure” attitudes toward foremen and
worker groups (r = .73).

Differences between Work Groups in Their
Leadership Expectations. An analysis of
variance was made of the scores derived from
226 workers, drawn from 73 different work
groups on the questionnaire “How You Expect
an Ideal Foreman to Act.” This analysis
revealed significant differences between
work groups relative to that within work
groups (F = 14.7, P < .01) in how “considerate”
they expect an ideal foreman to be.
Apparently, worker attitudes concerning the
amount of “consideration” desired depend to
a large extent on the particular work groups.
However, differences between work groups in
how much “structuring” behavior they felt
the foremen should engage in were not significant.
This may be due to the small variation
in scores on this dimension for the total sample
of workers (sigma = 3.9, see Table 2).

Relationships with Labor Grievance Rates.
A problem of future research with these instruments
is a well-controlled criterion study
relating these measures to various criteria of
group effectiveness in a variety of leadership-group
situations in industry. The independence
of dimension scores has special relevance

here since each may be differentially related 
to such criteria, depending on the situation. 
Although such a criterion study was beyond 
the scope of the present investigation, corre- 
lations were obtained between some of the 
questionnaires and labor grievance rates in 
23 departments over an eight-month period. 
In this limited study only one correlation 
reached the 1% level of significance (based 
on an N of 23 departments). This was the 
correlation of -.53 between departmental grievance rates and the mean scores
of foremen in each department on the “Consideration”
dimension of the form “What
Your Boss Expects of You.” The correlation
with the “Initiating Structure” score of this 
form was .32. The trend was for depart- 
ments with high worker grievance rates to 
be those whose foremen perceived their own 
supervisors as expecting them to lead with a 
low degree of “consideration” and a high de- 
gree of “structuring.” These results, of 
course, are purely suggestive. An adequate 
evaluation of the value of these instruments 
in predicting group effectiveness must await 
additional research. 
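The significance statements for the two grievance-rate correlations can be checked with the usual t transformation of a correlation coefficient, t = r * sqrt(N - 2) / sqrt(1 - r^2), with N - 2 degrees of freedom. The sketch below is an assumption about how such a check might be made; the paper does not say which test or table was used. The function name is hypothetical, and the critical value 2.831 is the standard two-tailed 1% point of Student's t for 21 degrees of freedom.

```python
import math


def t_for_r(r, n):
    """Convert a correlation r based on n pairs into a t statistic (n - 2 df)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Two-tailed 1% critical value of t for 21 degrees of freedom (standard table value).
T_CRIT_01_DF21 = 2.831

N_DEPARTMENTS = 23  # from the text

for label, r in [("Consideration", -0.53), ("Initiating Structure", 0.32)]:
    t = t_for_r(r, N_DEPARTMENTS)
    verdict = "reaches" if abs(t) > T_CRIT_01_DF21 else "does not reach"
    print(f"{label}: r = {r:+.2f}, t = {t:+.2f}, {verdict} the 1% level")
```

Under these assumptions the “Consideration” correlation falls just beyond the 1% critical value and the “Initiating Structure” correlation falls well short of it, which agrees with the statement that only one correlation reached the 1% level.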

The Leadership Opinion Questionnaire has 
been found of value in the evaluation of a 
leadership training course for foremen and 
in the study of certain social factors affecting 
the foreman’s leadership role (1, 3). 


Summary 


The development of questionnaires to meas- 
ure certain aspects of leadership attitudes in 
industry has been described. The question- 
naires were designed to measure two relatively 
independent dimensions of leadership atti- 
tudes. These dimensions were called “Con- 
sideration” and “Initiating Structure.” Vari- 
ous forms of the questionnaires, revised on 
the basis of a pilot study, were administered
at various levels in the industrial hierarchy. 
On each questionnaire, the dimensions were 
shown to have sufficient reliabilities, insig- 
nificant intercorrelations with each other, and 
adequate distributions. 

A comparison of the leadership attitude 
scores at four plant levels revealed that the 
higher people were in the plant hierarchy, the 
less “Consideration” they felt the workers 
should get and the more “Structuring” they 
felt should be initiated. The attitudes of the 
foreman group on each dimension fell some- 
where between what the workers expect and 
what their own supervisors expect, but were 
much more like the attitudes of their super- 
visors. 

A comparison also was made between the 
attitudes of the supervisors of foremen toward 
leading foremen and toward leading workers. 
The results showed that these supervisors 
scored significantly higher in the amount of 
“Consideration” they felt should be shown 
the foremen relative to that shown to work- 
ers, but no significant differences in their 
“Structuring” attitudes toward each group. 
High correlations were found between these 
attitudes of supervisors toward foremen and 
toward workers on both dimensions. 

With reference to the workers’ attitudes 
concerning the amount of “Consideration” 
they would like in an “ideal foreman,” the 
results indicate this depends to a large ex- 
tent on the particular work group. There 
were significant differences between work 
groups relative to that within work groups in 
the amount of “Consideration” desired, but 
insignificant differences with respect to the 
amount of “structuring” desired. The work- 
ers as a whole were more homogeneous in 
their attitudes about this latter dimension. 

Based on limited data, it was found that 
departments with high worker grievance rates 
contained foremen who perceived their own 
supervisors as expecting them to lead with a 
lower degree of consideration and a higher
degree of structuring. 

It should be stressed that the findings re- 
ported here are regarded as specific to the 
particular plant and the groups of workers 
and supervisors studied. Additional research 
is needed before valid generalizations can be 
made. It is possible that future research will 
indicate that combinations of measures of 
such things as group characteristics, needs 
and expectations, leadership attitudes, behav- 
iors and perceptions, pressures from super- 
visors, etc. can yield more successful predic- 
tions where ordinary testing procedures have 
failed in the complex field of leadership and 
group effectiveness. 


Received August 4, 1952. 
Productivity and Attitude Toward Supervisor *


C. H. Lawshe and Bryant F. Nagle 


Occupational Research Center, Purdue University 


Of utmost importance in the study of hu- 
man behavior are the factors which motivate 
individuals. Inquiries into the motivations 
of people and the relations of these motiva- 
tions to performance are being initiated and 
expanded throughout the field of psychology. 
This is especially true in industrial psy- 
chology. 

In the industrial situation psychologists are 
asking, what motivates employees toward pro- 
ductive effort? How do the financial rewards 
of work, the behavior of the supervisor, the 
nature of the job itself, and the goals of man- 
agement affect the effort of employees (4)? 
The most common approaches to these ques- 
tions have been through the use of question- 
naires, interviews and projective techniques 
to determine the attitudes of employees. 
Psychologists are seeking to relate employee 
attitudes to the actual industrial practices 
of paying for services, of supervising, of 
performing the job, and of setting goals that 
will result in the highest productivity. Im- 
plicit in this approach is the assumption that 
productivity is related to employee attitude. 
This assumption is accepted by nearly every- 
one, yet little experimental evidence has been 
presented. This paper is a report on the re- 
lationship between employee attitudes and 
productivity. 


Subjects 


The population used in this study is part 
of the office force of a large industrial plant 
and is divided into a number of departments. 
Since some of the departments in the plant 
are small and since some did not participate 
in all phases of the study, this report is based 


*For the past three years the Purdue Research 
Foundation and Louisville Works of the Interna- 
tional Harvester Company have been cooperating in 
personnel research. This report is concerned with 
only one phase of a larger study involving the rela- 
tionships between work group productivity, employee 
attitudes, and supervisory sensitivity to employee 
attitudes. A complete report of the study will ap- 
pear later in the literature. 


on only 14 work groups. Of the 223 non- 
managerial, salaried employees in these 14 
work groups, 208, or 93%, participated in the 
portion reported here. 


Productivity Criterion 


The Rating Procedure. Since each of the 
14 work groups was engaged in a different 
type of activity it was impossible to get com- 
parable objective measures of output. In- 
stead, a paired comparison rating of produc- 
tivity was used. Six executives in the plant 
(1. Works Manager, 2. Training Director, 3. 
Staff Assistant to Works Auditor, 4. Staff 
Assistant to Works Manager, 5. Assistant 
Works Auditor, 6. Works Auditor) were asked 
to indicate those work groups which they felt 
capable of rating. The range of selections 
was from eight to 14. The executives were 
supplied with paired comparison forms and 
instructed to indicate “. . . The department 
in each pair which is, in your opinion, doing 
its job better.” 

Each executive’s ratings were converted to 
standard scores as suggested by Lawshe, Kep- 
hart and McCormick (6). The standard 
scores for each work group as given by the 
odd numbered raters were averaged and cor- 
related with the mean standard scores of the 
even numbered raters. The resulting coeffi- 
cient of .78 was stepped up by the Spearman- 
Brown formula to estimate the reliability of 
the means of all six raters, yielding an r of 
.88. 
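
The step-up from the odd-even coefficient to the reliability of the pooled ratings can be checked directly. The sketch below is a minimal illustration, assuming the general Spearman-Brown form r_k = kr / (1 + (k - 1)r), with k = 2 because the means of the three odd numbered and the three even numbered raters are pooled into the means of all six. The function name is hypothetical.

```python
def spearman_brown(r, k=2):
    """Estimate the reliability of a measure lengthened k times.

    r is the correlation between the two halves (here, the mean ratings
    of the odd numbered raters vs. those of the even numbered raters).
    """
    return k * r / (1 + (k - 1) * r)


# Coefficient reported between the odd and even numbered raters.
pooled = spearman_brown(0.78)
print(round(pooled, 2))  # 0.88
```

The same function applies to any odd-even split, provided the two halves can be treated as parallel measures.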

Measuring Productivity. Previous at- 
tempts of researchers to measure the produc- 
tivity of work groups have generally followed 
two lines. 

1. One approach (4) has sought to find 
situations in which there are comparable work 
groups, groups performing the same kind of 
work under the same conditions. Measures 
of productivity for the various work groups 
can be directly compared, since each group 
is doing the same job with the same equipment. 
In the best known application of this 
approach (4) the productivity measures for 
the various groups had little variability, and 
the resulting relationships with measured at- 
titudes were small. While this approach to 
group productivity has many advantages, it 
is limited by the rarity with which one finds 
a number of work groups comparable in size, 
work performed, physical working conditions, 
equipment, and financial rewards. 

2. Another approach (2) has sought to 
determine how well each work group meets its 
output quota. A “normal” level of output is 
set for each work group, and the group’s 
relative productivity is represented by the 
actual output divided by the normal level 
prescribed. This approach is limited not only 
by the validity of the output levels prescribed, 
but also by the fact that the productivity of 
many work groups, especially in an office 
situation, can rarely be measured in physical 
units turned out, either because it is not the 
job of the particular group to process so many 
of this or that, or because the work output is 
regulated by the activities of other depart- 
ments. 


Rating Limitations. The use of a rating 
approach to work group productivity as utilized 
in this study also has its limitations. It 
has the prime limitation of any rating system; 
one does not know for sure what the raters 
really had in mind when they rated. In this 
case an effort was made to stress the relative 
performance of the work groups. Verbal as- 
sociation of a supervisor with his particular 
work group was avoided in an effort to mini- 
mize the influence of supervisor personality 
in the ratings. It is the feeling of the authors 
that the raters thought of the various work 
groups as functioning entities which were 
there to serve the organization. How little 
trouble the work group caused, whether or 
not it had the answers when called upon, 
whether or not it could cope with rush situa- 
tions, and similar considerations are believed 
to have been the prime factors in the ex- 
ecutives’ ratings. 


Attitude Toward the Supervisor 


The Questionnaire. Nearly all office employees 
of the plant filled out a tailor-made 
attitude questionnaire. The questionnaire included 
21 items about the individual employee’s 
immediate supervisor so as to provide, 
to some extent, a diagnostic view of the 
supervisor as well as a total score representing 
the employee’s opinion of the supervisor. 

Item Selection Procedure. Using a primary 
group composed of 50% of the participating 
employees, the 21 questions were scored by 
giving a weight of 1 to the most favorable 
response and 0 to the other one, two or three 
responses. Total scores were computed for 
each employee. On the basis of total score 
the primary group was divided into a high-scoring 
half and a low-scoring half. Then the 
per cent of the high-scoring half and of the 
low-scoring half giving the 
most favorable response to each of the 21 
items was computed. The significance of the 
difference between these two per cents was 
computed for each item by means of the 
Lawshe-Baker nomograph (5). Two items 
were discarded by this process. All remaining 
items on the survey were also processed in the 
same manner. Three new items were added to the 
19, making a total of 22 items measuring employee 
attitude toward the supervisor. 
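
The high-low comparison described above can be sketched as follows. This is an illustrative outline only: the response data and the function name are hypothetical, and the significance test read from the Lawshe-Baker nomograph is not reproduced; the sketch simply reports the two per cents and their difference for each item.

```python
def high_low_item_analysis(responses):
    """Compare per cent favorable in the high- and low-scoring halves.

    responses: one list per employee, each entry 1 (most favorable
    response) or 0 (any other response).
    Returns a list of (pct_high, pct_low, difference), one per item.
    """
    totals = [sum(person) for person in responses]
    # Order employees from highest to lowest total score.
    order = sorted(range(len(responses)), key=lambda i: totals[i], reverse=True)
    half = len(order) // 2
    high, low = order[:half], order[-half:]
    results = []
    for item in range(len(responses[0])):
        pct_high = 100 * sum(responses[i][item] for i in high) / half
        pct_low = 100 * sum(responses[i][item] for i in low) / half
        results.append((pct_high, pct_low, pct_high - pct_low))
    return results


# Hypothetical data: four employees, four items.
data = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
for number, row in enumerate(high_low_item_analysis(data), start=1):
    print(number, row)
```

In this made-up example the fourth item is answered favorably by everyone, so its difference is zero; an item of that kind does not separate the two halves and would be a candidate for discarding.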

Scale Reliability. Each questionnaire in 
the holdout group was scored on the 22 items. 
Separate total scores for each employee were 
computed for the 11 odd numbered items and 
for the 11 even numbered items, and these 
were correlated. The resulting coefficient of 
.865, when stepped up for 22 items, yielded a 
scale reliability of .92. 

Individual employee scores on this scale 
ranged from 0 to 22. This score may be 
easily interpreted as the number of questions 
which the employee answered in the most 
favorable manner. Average scores for atti- 
tude toward each supervisor were computed 
and ranged from 8.8 to 19.3. The average 
score of the 14 work groups toward their 
supervisor was 13.9 and the standard devia- 
tion 3.2. 

The Attitude Dimension. The content of 
the 22 items is important to an understand- 
ing of what was being measured by the scale. 
The questions covered many aspects of the 
supervisor’s behavior as perceived by the em- 
ployees, including such things as, does he: 
give you straight answers, avoid you when he 
knows you want to see him about a problem, 
criticize you for happenings over which you 
have no control, delay in taking care of your 
complaints, keep you informed, give you 
recognition, show interest in your ideas, 
follow through on his promises, explain to 
you the “why” of an error to prevent recur- 
rence, give you sufficient explanation of why 
a work change is necessary, etc. 


Results 


Correlation. The average rating of each 
work group on how well it was doing its job 
was correlated with the average attitude score 
in the work group toward the supervisor. 
The Pearson coefficient was .86. With 12 
degrees of freedom, a correlation of .661 is 
significantly different from zero at the 1% 
level of confidence. Figure 1 provides a visual 
indication of the relationship between the 
two variables as well as an indication of the 
general dispersion of each variable. 
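
The critical value of .661 can be recovered from the t distribution. The sketch below assumes the standard relation t = r * sqrt(df / (1 - r^2)) and a two-tailed 1% critical t of 3.055 for 12 degrees of freedom, taken from ordinary t tables; both function names are hypothetical.

```python
import math


def t_from_r(r, df):
    """t statistic for testing a Pearson r against zero."""
    return r * math.sqrt(df / (1 - r * r))


def critical_r(t_critical, df):
    """Smallest r that reaches a given critical t with df degrees of freedom."""
    return t_critical / math.sqrt(t_critical ** 2 + df)


# Two-tailed 1% critical t for 12 d.f. (assumed, from standard tables).
T_CRIT_01_DF12 = 3.055

print(round(critical_r(T_CRIT_01_DF12, 12), 3))   # 0.661
print(t_from_r(0.86, 12) > T_CRIT_01_DF12)        # True
```

With 14 work groups there are 14 - 2 = 12 degrees of freedom, so the obtained coefficient of .86 lies well beyond the 1% critical value.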

Interpretation. In the interpretation of 
this very high relationship between rated 
productivity and employee attitude toward 
supervisor, caution must be exercised. It is 
all too easy to fall into the error of cause and 
effect thinking. On the basis of this study it 
can be concluded only that the behavior of 
the supervisor, as perceived by the employees, 
is highly related to the productivity of the 
group as perceived by higher management. 

Fig. 1. Scatter diagram showing relationship between 
productivity of departments as rated by six 
executives and average employee attitude toward 
supervisor. The correlation between the variables 
is .86. (Axis label: Average employee attitude 
toward supervisor.) 

The literature has long been replete with 
statements as to the influence of the supervisor 
on group output. French says, “Leadership 
has long been regarded as the most 
important factor in group effectiveness . . .” 
(1, p. 475). He points out that, “Since the 
manipulation of (or allowance for) variables 
related to morale is in institutional groups 
primarily the responsibility of appointed 
leaders, the factor of leadership assumes central 
significance” (1, p. 485). If the basic 
assumption that the attitudes of people exert 
a great influence over their performance is 
true, then it follows that the leader has within 
his power a means by which he can influence 
the output of the group. 

Results similar to those reported here have 
been obtained in the Prudential study. 
Katz (3) lists a number of variables in super- 
visory behavior which were related to the 
productivity of the work group. It was found 
that supervisors of high productivity groups 
placed less direct emphasis on production as 
the goal, encouraged worker participation in 
making decisions, were more employee 
centered, and spent more time in supervision 
and less in production work. The only em- 
ployee attitude in the study positively related 
to productivity was pride in the work group. 
However, employee attitude toward super- 
visor was not mentioned in the report. In 
view of the types of supervisory behavior 
which were found to be related to produc- 
tivity, it might be inferred, however, that 
employee attitude toward this supervisory 
behavior would also have been so related. 


Summary 


A measure of relative productivity of work 
groups in doing their jobs was obtained by 
having six executives rate the work groups 
by the paired comparison system. An atti- 
tude questionnaire was administered to the 
employees of these work groups. From this 
questionnaire 22 items were used to measure 
employee attitude toward the immediate 
supervisor of the work group. The correlation 
between the executives’ rating of the 
productivity of the work groups and the 
employee’s attitude toward the supervisor was 
.86. This relationship substantiates the 
hypothesis that the supervisor’s behavior, as 
perceived by the employees, is highly related 
to the output of the work group. 


Received April 14, 1953. 
Early publication. 
During the past several years, the Industrial 
Relations Center of the University of 
Chicago has been engaged in the development 
of the SRA Employee Inventory, an instrument 
for assessing employee attitudes.¹ The 
inventory was developed by a coordinated re- 
search team representing the fields of psy- 
chology, sociology, business, and economics. 
It was the intent of the research team to con- 
struct an inventory which would yield a pro- 
file of scores to reflect the attitude of any 
given group of employees toward the signifi- 
cant factors in the work situation. In addi- 
tion, the inventory was to be so constructed 
that its administration and scoring and the 
interpretation of results could be accom- 
plished with the minimum expenditure of 
time and effort. 

During the developmental stage of the in- 
ventory, the writer investigated experimen- 
tally several problems in test construction. 
This investigation dealt with the way in 
which the profile of scores was affected by: 
(1) the arrangement of the items (random- 
ized vs. categorized items); (2) the number 
of scale intervals (five-point vs. three-point 
scales); and (3) the scoring procedure (un- 
weighted vs. weighted responses). 

The effect on the profile of scores was in- 
vestigated separately for each of these con- 
ditions. In addition, a comparison was made 
of the profiles of scores resulting from six 
possible combinations of item arrangement, 
number of scale intervals, and scoring pro- 
cedure. These six combinations represent 
procedures of an increasing degree of com- 
plexity in the treatment of the employee re- 
sponses. The object was to identify the sim- 
plest procedure which could be used without 
loss of information. 


1 Published by Science Research Associates, Inc., 
Chicago. 


The Problem 


Randomized vs. Categorized Items. In an 
inventory or test in which groups of items are 
combined to yield sub-test (category) scores, 
the items may be presented either in random 
order or grouped under the category headings 
to which they belong. The question arises 
as to whether or not the grouping of items 
will affect the profiles of category scores. In 
other words, do the test items yield different 
profiles of category scores when they are 
grouped together or categorized in the in- 
ventory than when they are randomized 
throughout the inventory? 

Five-Point vs. Three-Point Scale. The 
number of intervals which can be used effec- 
tively in any inventory or test is a function 
of such conditions as the degree to which the 
attribute to be rated can be objectively de- 
fined, the degree of skill possessed by the 
raters in the use of rating scales, and their 
interest in making the ratings. The question 
arises as to whether or not the use of a five- 
point scale would yield a profile of category 
scores which was different from that obtained 
with the three-point scale. 

Unweighted vs. Weighted Responses. When 
the five-point scale is used, the further ques- 
tion arises as to whether or not the profile of 
category scores will be affected if the re- 
sponses in the extreme scale intervals are 
given twice the weight of the responses in the 
scale intervals immediately preceding them. 

The General Problem. Six procedures for 
the treatment of employee responses result 
from the combination of the three conditions 
discussed above. These are as follows: (1) 
a three-point scale with categorized items; 
(2) a three-point scale with randomized 
items; (3) an unweighted five-point scale 
with categorized items; (4) an unweighted 
five-point scale with randomized items; (5) 
a weighted five-point scale with categorized 
items; and (6) a weighted five-point scale 
with randomized items. 

The hand scoring of an inventory utilizing 
a three-point scale and categorized items need 
involve only a count of the number of ac- 
ceptable responses in groups of consecutive 
items. Under these conditions the profile of 
category scores is immediately available. It 
is self-evident that an inventory composed of 
randomized items or one in which responses 
to items must be made in terms of a scale 
having a larger number of intervals, espe- 
cially if the extreme intervals are weighted, 
would require a proportionately greater 
amount of administration and scoring time. 
The general problem, therefore, is to deter- 
mine whether or not the simplest procedure 
(use of the three-point scale with categorized 
items) yields a profile of category scores 
which is different from the profile yielded by 
the more complicated procedures. 


The Experimental Design 


A total of 64 items, consisting of state- 
ments descriptive of the work situation, were 
selected for inclusion in a preliminary form 
of the SRA Employee Inventory. These were 
grouped under the following general cate- 
gories: I. Job Demands; II. Working Condi- 
tions; III. Pay; IV. Company Benefits; V. 
Changes on the Job; VI. Friendliness of Fel- 
low Employees; VII. Supervisory Effective- 
ness; VIII. Management and Company Pol- 
icy; IX. Communication; and X. Personal 
Satisfaction on the Job. 

Four forms of the inventory were con- 
structed as follows: 

1. Randomized items to which responses 
were to be made on a three-point scale. 

2. Randomized items to which responses 
were to be made on a five-point scale. 

3. Categorized items to which responses 
were to be made on a three-point scale. 

4. Categorized items to which responses 
were to be made on a five-point scale. 

Each of the four forms of the inventory 
was administered to a separate group of em- 
ployees at a retail store of a large merchan- 
dizing organization in Chicago. These groups 
of employees were approximately equal and 
were randomly selected from a total experi- 
mental population of 454 subjects. 

For the two forms in which the three-point 
scale was used, the employee was required to 
indicate whether he agreed, was undecided, 
or disagreed with each statement (i.e., in- 
ventory item). About half the items in each 
category were company oriented, and half, 
anti-company oriented. If an employee 
agreed with a company-oriented item, e.g., 
“Management here is really interested in the 
welfare of employees,” it was regarded as a 
“Favorable” response (i.e., favorably oriented 
toward the company). If he disagreed with 
such an item, it was regarded as an “Unfavorable” 
response. The converse held with 
respect to the items that were anti-company 
oriented. 

Essentially the same procedure was fol- 
lowed also for the two forms in which the 
five-point scale was used. However, since 
the five-point scale provided the employee 
with the opportunity to indicate, if he so 
chose, that he strongly agreed or strongly 
disagreed with an item, there were two addi- 
tional types of response. These were re- 
garded as indicating a “Very Favorable” or 
a “Very Unfavorable” orientation toward the 
company. 


Results 


The inventory yields a profile of ten cate- 
gory scores. Each category score is the per 
cent of favorable responses made by the group 
to the items in the category. It is regarded 
as a measure of the positive feeling held to- 
ward the company by the group. The per 
cent of favorable responses rather than the 
number of favorable responses was used be- 
cause the number of items varies from cate- 
gory to category. The specific formulae em- 
ployed in the calculation of the per cent 
favorable response (P.F.R.) are given below. 

Three-Point Scale (Categorized or Randomized Items): 

\[ \text{P.F.R.} = \frac{100F}{N \times I} \]

Unweighted Five-Point Scale (Categorized or Randomized Items): 

\[ \text{P.F.R.} = \frac{100(F + VF)}{N \times I} \]

Weighted Five-Point Scale (Categorized or Randomized Items): 

\[ \text{P.F.R.} = \frac{100(F + 2VF)}{2N \times I} \]

where 

P.F.R. is the per cent favorable response, 

F is the number of “Favorable” responses, 

VF is the number of “Very Favorable” responses, 

N is the number of persons in the group, and 

I is the number of items in the category. 

The profiles of the per cent favorable response 
which result from the six procedures in 
the present investigation are shown in Figure 1, 
where the following identifying symbols 
have been used: 

A. Unweighted five-point scale with randomized items. 
A’. Unweighted five-point scale with categorized items. 
B. Weighted five-point scale with randomized items. 
B’. Weighted five-point scale with categorized items. 
C. Three-point scale with randomized items. 
C’. Three-point scale with categorized items. 

It can be seen by inspection of Figure 1 
that all the profiles exhibit a great similarity 
in shape, though the profiles in which 
weighted responses were used occur at a lower 
level on the scale. 

A quantitative measure of the similarity 
between the profiles was obtained by calculating 
the product-moment correlation coefficients 
between the sets of category scores. A 
comparison of the profiles obtained from randomized 
and from categorized items, when 
the possible effects of the number of scale 
intervals and the method of scoring are constant, 
is obtained by comparing profile A with 
A’, B with B’, and C with C’. A comparison 
of the profiles obtained from the three-point 
and the five-point scales, when the possible 
effects of the order of appearance of the items 
are constant, is obtained by comparing profile 
A with C, A’ with C’, B with C, and B’ with 
C’. A comparison of the profiles obtained 
from the five-point weighted scale and the 
five-point unweighted scale, when the possible 
effects of the order of appearance of the items 
are constant, is obtained by comparing profile 
A with B, and A’ with B’. For the sake of 
completeness, the other possible profile comparisons 
were also made, i.e., A with B’, A’ 
with B, A with C’, A’ with C, B with C’, and 
B’ with C. The results for these four sets of 
comparisons are given in Table 1. 


Table 1 
Product-Moment Correlation Coefficients Between the Ten Category Scores in the Six Profiles 
Obtained from the Different Procedures in the Treatment of Employee Responses 

Randomized vs. Categorized Items 
  Profiles Compared    AA’    BB’    CC’ 
  r                    .96    .97    .97 

Five-Point vs. Three-Point Scale 
  Profiles Compared    AC, A’C’, BC, B’C’ 
  r                    .97, .99, .98 [one value illegible] 

Unweighted vs. Weighted Responses 
  Profiles Compared    AB     A’B’ 
  r                    .99    [illegible] 

Other Possible Comparisons 
  Profiles Compared    AB’, A’B, AC’, A’C, BC’, B’C 
  r                    .95, .95, .95 [three values illegible] 
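
The per cent favorable response and the profile correlations can be illustrated with a short sketch. All counts below are hypothetical, the function names are not from the inventory manual, and the divisor 2NI for the weighted scale is an assumption, chosen so that a category in which every response is "Very Favorable" scores exactly 100.

```python
import math


def pfr_three_point(f, n, i):
    """Per cent favorable response on the three-point scale."""
    return 100 * f / (n * i)


def pfr_unweighted(f, vf, n, i):
    """Five-point scale, Favorable and Very Favorable counted alike."""
    return 100 * (f + vf) / (n * i)


def pfr_weighted(f, vf, n, i):
    """Five-point scale, Very Favorable counted twice (divisor 2NI assumed)."""
    return 100 * (f + 2 * vf) / (2 * n * i)


def pearson_r(x, y):
    """Product-moment correlation between two sets of category scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical category: 100 employees, 5 items, 200 F and 100 VF responses.
print(pfr_unweighted(200, 100, 100, 5))  # 60.0
print(pfr_weighted(200, 100, 100, 5))    # 40.0

# A profile and a linearly rescaled copy of it correlate perfectly, which is
# why a difference in level alone does not lower the coefficients.
profile = [68.0, 69.6, 48.7, 67.3, 64.3]
rescaled = [0.6 * score + 2 for score in profile]
print(round(pearson_r(profile, rescaled), 6))  # 1.0
```

Under this assumed divisor the weighted score can never exceed the unweighted one for the same responses, which agrees with the weighted profiles lying at a lower level.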





Melany E. Baehr 


[Figure 1: six panels, one for each procedure, each plotting Per Cent Favorable Response (scale marked from 10 to 90) for categories I through X. Panel titles: A. Five-Point Scale, Randomized Items (Unweighted); B. Five-Point Scale, Randomized Items (Weighted); C. Three-Point Scale, Randomized Items; A’. Five-Point Scale, Categorized Items (Unweighted); B’. Five-Point Scale, Categorized Items (Weighted); C’. Three-Point Scale, Categorized Items.] 

Fig. 1. Profiles showing the per cent favorable response to the ten categories in the 
Employee Inventory. There is one profile for each of the six procedures employed in the 
treatment of the employee responses. 


The high correlation coefficients in Table 1 
indicate that the profiles are similar in shape, 
or in other words, that there is a linear rela- 
tionship between the sets of category scores 
contributing to the profiles. The sets of cate- 
gory scores contributing to the six profiles A, 
A’, B, B’, C, and C’ were also compared with 
respect to their variances. Application of 
Bartlett’s test of homogeneity of variances² 
gave a chi-square of 1.291 with 5 degrees of 
freedom, yielding a P value of .94. 
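
A minimal sketch of this check follows, assuming the usual textbook form of Bartlett's statistic with its correction factor. The sample data are hypothetical, and the upper-tail probability uses a closed form that holds only for 5 degrees of freedom (six sets of scores).

```python
import math
from statistics import variance


def bartlett_statistic(groups):
    """Return (chi-square, degrees of freedom) for homogeneity of variances."""
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [variance(g) for g in groups]   # unbiased sample variances
    df_total = sum(dfs)
    pooled = sum(d * v for d, v in zip(dfs, variances)) / df_total
    numerator = df_total * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / d for d in dfs) - 1 / df_total) / (3 * (k - 1))
    return numerator / correction, k - 1


def chi_square_upper_tail_5df(x):
    """P(chi-square > x) for exactly 5 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + (
        math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * (1 + x / 3)
    )


# The reported chi-square of 1.291 with 5 d.f.
print(round(chi_square_upper_tail_5df(1.291), 2))  # 0.94

# Hypothetical sets of scores with identical spread give a statistic near zero.
stat, df = bartlett_statistic([[1, 2, 3, 4], [2, 3, 4, 5], [10, 11, 12, 13]])
print(df, abs(stat) < 1e-9)
```

A probability this close to 1 means the observed differences among the six variances are smaller than would ordinarily arise by chance, so there is no evidence that the procedures differ in the variability of their category scores.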

The results indicate, therefore, that the six 
profiles are highly similar with respect to both 
their shape and the variability of their category 
scores. The profiles obtained from the 
weighted five-point scale are at a different 
level, but can readily be converted to the 
same level as the remaining profiles by applying 
a constant stretching factor to the category 
scores. 

2 Snedecor, George W. Statistical methods. Ames, 
Iowa: The Iowa State College Press, 1946, p. 249 ff. 

The high correlations between the profiles 
indicate that only minor variations occur in 
the individual category scores. Such minor 
variations would not affect the interpretation 
of the profile as a whole. 


Summary 


A comparison was made of the profiles of 
category scores obtained from six different 
procedures in the treatment of the responses 
to items in an inventory designed to reflect 
the attitude of industrial employees toward 
the significant aspects of the work situation. 
These six procedures represented progres- 
sively increasing complexity in the arrange- 
ment of the items, the rating scales employed, 
and in the method of scoring the responses. 

Comparison of the relevant profiles showed 
that: 

(1) Almost identical profiles were obtained 
from randomized and from categorized 
items, 

(2) Almost identical profiles were obtained 
from five-point and from three-point 
scales, and 
(3) Almost identical profiles were obtained 
from unweighted and from weighted 
responses. 
Finally, all possible comparisons were made 
between the six profiles studied in this investi- 
gation. The 15 product-moment correlation 
coefficients which resulted ranged from + .94 
to + .98 (N= 10). This indicated that the 
profiles are highly similar in shape. Applica- 
tion of Bartlett’s test of homogeneity of vari- 
ances indicated that the profiles were similar 
with respect to the variability of their cate- 
gory scores. It can be concluded, therefore, 
that the use of the simplest procedure, i.e., 
the three-point scale with categorized items, 
results in a profile of scores which would be 
interpreted in exactly the same way as those 
resulting from the other five, more compli- 
cated procedures. 

It is clear that the use of the simplified 
procedure will result in considerable savings 
in time, labor, and costs involved in the ad- 
ministration and scoring of the inventory. 
From a practical standpoint, therefore, this 
investigation points up the desirability of 
running pilot studies to determine whether 
or not a simpler form of a test or inventory 
will yield any less information than more 
complicated ones for specific subject popu- 
lations. This is especially true when, as is 
often the case in industrial and educational 
institutions, an inventory which is once ac- 
cepted is likely to be routinely administered 
to thousands of cases year by year. 


Received July 14, 1952 
Effectively using psychological testing to 
aid in selecting supervisory personnel presents 
an extremely important problem in motiva- 
tion. The question is whether there are dif- 
ferences in motivation in taking tests for re- 
search or for actual promotion purposes. If 
there are motivational differences between 
taking tests for research and for keeps, which 
basis of motivation will elicit test responses 
that more clearly reflect the individual’s actual 
aptitude? 

Method 


The writer had an opportunity to check 
this with a sample of 40 supervisors who vol- 
unteered initially to participate in a testing 
program aimed at obtaining for research purposes 
a measure of the qualities and characteristics 
identifying the group as a whole. 
The supervisors were randomly divided into 
two groups of 20 each. Rough comparability 
was obtained in age, education and experience 
since differences between these means and 
sigmas did not exceed the .05 level of sig- 
nificance. The two groups identified as 1 
and 2 were given the Wonderlic Personnel 
Test Form A. 

Three months later the same two groups 
were given Form B, but supervisors in the 
Control Group 1 were encouraged to co- 
operate for purely research purposes while 
supervisors in the Experimental Group 2 were 
asked to cooperate for the purpose of giving 
management additional information for de- 
termining whom among them to promote to 
higher supervisory levels. 

In order to determine which basis of moti- 
vation elicited test scores more nearly repre- 
sentative of actual aptitude, a criterion of 
over-all performance was obtained. Superiors 
knowing each supervisor in Group 1 ranked 
them from best to poorest on over-all per- 
formance as defined in a training session. 
The same procedure was followed in evaluating 
supervisors in Group 2. A reranking of 
each supervisor in both groups three months 
later showed the criterion to have an estimated 
reliability of +.89. Correlations between 
test scores and criterion for Groups 
1 and 2 for both testing situations were ob- 
tained by the rank-differences method. 
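
The rank-differences coefficient can be sketched as below, assuming the classical formula rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)). Tied scores are given their mean rank, in which case the formula is only approximate. The scores are hypothetical and the function names are illustrative.

```python
def average_ranks(values):
    """Rank from 1 (largest value) downward; ties share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            ranks[i] = mean_rank
        pos = end + 1
    return ranks


def rank_difference_rho(x, y):
    """Rank-differences correlation between two sets of scores."""
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical first and second test scores for five supervisors.
first = [18, 22, 25, 27, 30]
second = [21, 19, 28, 26, 33]
print(round(rank_difference_rho(first, second), 2))  # 0.8
```

Because only rank order enters the computation, a uniform rise in scores from one testing to the next leaves the coefficient unchanged; it drops only when individuals change places relative to one another.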


Results 


Table 1 shows the mean scores and sigmas 
for Group 1 and 2 with respect to Form A 
and B of the Wonderlic Personnel Test. 

Whereas the differences in means and sig- 
mas were not significant between the first and 
second testing for the Control Group 1, the 
Experimental Group 2, believing their per- 
formance at the second testing would affect 
their opportunity for promotion, increased 
their mean score almost seven points. 


Table 1 
Scores of the Wonderlic Tests 

              Group 1     Group 2 
              (N = 20)    (N = 20)    Difference* 
Form A 
  Means        19.1        19.9       [illegible] 
  Sigmas       [entries partly illegible; “5.0” and “44” can be read] 
Form B 
  Means        20.0        26.6       6.63 
  Sigmas       5.7         6.4        .63 

* Differences computed before rounding to one decimal place. 

† Indicates significant difference beyond .05 level of confidence. 


However, did supervisors in both Groups 
1 and 2 maintain comparable scores in the 
two testing situations? The correlations by 
the rank-differences method between first and 
second testings were + .76 and + .39, re- 
spectively, for Groups 1 and 2. The former 
but not the latter is significantly greater than 
zero since it exceeds the .05 level of con- 
fidence. 

Generally, supervisors in Group 1 maintained 
comparable absolute and relative 
scores in both testing situations. Supervisors 
in Group 2 did not maintain absolute and 
relative scores when advised that promotions 
would be based on test performance. Inspec- 
tion revealed that several supervisors changed 
rank-positions from highest to lowest and in 
two cases rank values changed while nu- 
merical scores did not. 

The correlations between test scores and 
criterion for Groups 1 and 2 were, respec- 
tively, + .41 and + .34 for the first testing 
and + .37 and + .67 for the second testing. 
Only the last correlation is significantly 
greater than zero since it exceeds the .05 level 
of confidence. 

These data tend to indicate that an in- 
significant relationship existed between test 
scores and criterion of over-all performance 
when the tests were administered for purely 
research purposes. However, changing the 
basis of motivation from that of research to 
that of promotion purposes brought about 
a highly significant relationship between test 
scores and criterion. 

It might be interesting to mention that 
two men from Group 2 were actually promoted, 
since the several supervisors up for 
consideration were just by chance in Group 2. 
However, their test scores were not helpful 
in deciding which of the several to promote, 
since all of their scores on the second test 
were fairly high. But had scores on the first 
test, given for purely research purposes, been 
used to aid management in promoting two 
supervisors, it is doubtful that the two actually 
selected would have been chosen, since they 
had two of the lowest scores in their group. 


Summary 


The problem of whether there are differ- 
ences in motivation in taking tests for re- 
search or for promotion purposes was studied 
by giving to a group of supervisors two forms 
of the Wonderlic Personnel Test with a time 
interval between for research purposes. A 
second group took the same two forms but 
the second administration was with reference 
to possible promotion. The following results 
were obtained: 

1. The promotion motivation produced sig- 
nificant increases in the mean score whereas 
the control group showed no such increases. 

2. The promotion motivation changed the 
individual's relative standing in the experi- 
mental group as shown by the lower correla- 
tions between the two tests than occurred in 
the control group. 

3. Scores motivated by promotion purposes 
had greater validity as indicated by correlations 
with a criterion based on ratings of 
over-all performance. 

Although it is very difficult to draw general 
conclusions, the implications of this study 
should serve to sound a note of caution to 
others doing research on aptitude tests in 
industry to take special pains to control the 
factor of motivation. 

Engineers and technically trained personnel 
are key figures in meeting unprecedented 
demands of our armed forces, defense indus- 
try, and our civilian economy. As a result, 
the country faces a critical shortage of engi- 
neering personnel. In the year 1949-1950, 
a total of 57,159 (20) engineers were gradu- 
ated from the nation’s technical schools. 
These graduates were rapidly absorbed by 
industry. Against an estimated annual re- 
quirement for 30,000 new engineers, the 
yearly crop of engineering graduates, how- 
ever, is rapidly declining. Thus from a total 
of 38,000 graduated in 1951, the estimated 
number of graduates falls to 17,000 for 
1954 (19). 

In view of these figures, the selection of 
engineering students to continue in pursuit 
of graduate degrees becomes a problem of 
primary importance. It is undesirable that 
technical manpower be wasted in the unsuc- 
cessful pursuit of advanced training. In like 
manner, it is important that educators be 
able to identify the most able students in 
order that they may be urged to pursue grad- 
uate work. It has long been recognized by 
engineering faculties that wise selection of 
advanced students would be facilitated by 
development of a short, easily administered 
test with demonstrated validity for the as- 
sessment of potentialities necessary to suc- 
cess in graduate school. This article presents 
the rationale for the use of a special analogies 
test in this assessment task and constitutes a 
description of an exploratory attempt to build 
such an instrument. 

Only within recent years have systematic
efforts been made to predict success in engineering
curricula. Usually such investigations
have not met with the degree of success
enjoyed by projects designed to develop predictive
devices in other fields of academic endeavor.
However, studies (1, 3, 4, 9, 11, 16,
17) concerned with the prediction of success
in undergraduate engineering have uniformly
shown certain measures to be of maximum
utility. It would appear that an ideal combinational
measure for the evaluation of a
person’s aptitude for engineering would include
measures of previous academic achievement
(1, 4), general intelligence (16), and
facility in mathematics (1, 3, 4, 9, 11, 17).

* This study was completed while the author was
Teaching Assistant in the Institute of Technology,
University of Minnesota, 1950-51. It was completed
in partial fulfillment of M.A. requirements in personnel
psychology. The author wishes to acknowledge
the assistance and guidance given him by his
advisor, Professor Donald G. Paterson.

The problem with which this investigation 
was most concerned (i.e., the evaluation of 
graduating engineers) has been little recog- 
nized in the literature. The Graduate Rec- 
ord Examination has been used extensively 
in several fields, but little information relat- 
ing performance on the G.R.E. to achieve- 
ment in advanced engineering training is 
available. Learned (5, 6) in discussing the 
relative merits of the G.R.E. states that the 
theoretical approach of the G.R.E. (i.e., the 
testing of information gleaned from a variety 
of subject matter fields) has proved success- 
ful in prediction at the higher levels of aca- 
demic endeavor. Speer (13), on the other 
hand, feels that the broad generality of the 
subject matter tested by the G.R.E. is the 
very factor which makes it unsuitable for use 
with engineers. He emphasizes that selection 
of capable engineering graduate students 
must include a measure of general mental 
ability as well as measures of achievement 
in previous work. 

In general, little of a definitive nature has 
been done in the evaluation of graduating 
engineers. There is an indication that suc- 
cess in postgraduate employment is related 
to undergraduate grades (10). It is felt that 
proficiency in graduate school can be pre- 
dicted best by a test combining a measure of 
general intelligence with some measure of 
previous achievement (13). One relevant 
study (15) has shown that tests requiring 
the ability to perform abstract reasoning are 
efficient predictors of success in advanced
study in the physical sciences.

Experience with the verbal analogy as a 
test item has shown it to have characteristics 
which would appear to make it an efficient 
device for use in evaluating engineering grad- 
uates. The verbal analogy item is short. A 
test including many such items may thus be 
easily administered in a brief period of time. 
Because the analogy requires the perception 
of relations and the generation of correlate 
relations, it is a measure of abstract intellect 
(12). Furthermore, although it is related to 
verbal facility, it has also been shown to be 
associated (r = .67, .68) with measures of 
arithmetic reasoning and arithmetic compu- 
tation (18). Factor analyses of verbal anal- 
ogies tests (14) have indicated high loadings 
in V (verbal) and D (deductive) factors. 
The latter factor is most prevalent in tests 
calling for arithmetic reasoning, and number 
series completion, abilities which are impor- 
tant in predicting success in engineering train- 
ing. 

In the construction of analogy items, it is 
possible to include concepts calling for knowl- 
edge in specific subject matter fields. Thus 
analogies tests may be used to measure pre- 
vious achievement. Levine's study (7, 8) 
bears particularly on this point. He devel- 
oped an analogies test specific to the subject 
matter of psychology. He found his test to 
be a slightly better predictor of achievement 
in psychology courses than a test of general 
ability such as the Miller Analogies. He con- 
cludes, “At any rate the data obtained in this 
project would tend to indicate the feasibility 
of exploring the possible uses of special anal- 
ogies tests in other fields” (8, p. 305). 

Thus, a special analogies test involving engineering
knowledge and concepts may be an
efficient instrument in measuring capabilities
necessary to success in graduate engineering.¹
Because of this, such a test was constructed
and used in this exploratory evaluation of
engineering graduates.

¹ It may turn out that this type of test would be
even more important in the selection and placement
of engineers (sales, research, design, electrical, etc.)
in business and industry. For this reason, the test
here reported has been extended and is now being
validated in a number of established engineering research
departments.


The Present Study 


The purpose of this study was to build a 
test applicable to all fields of engineering. 
This decision necessitated drawing items from 
that store of information which can accurately 
be said to comprise the “common-core” of 
academic knowledge among graduating engi- 
neers. An analysis of the curricula in 14 
engineering colleges indicated the so-called 
“common-core” to consist of courses in in- 
organic chemistry, analytic geometry, trigo- 
nometry, algebra, differential and integral 
calculus, physics, hydraulics, statics and dy- 
namics, strength of materials, thermodynam- 
ics, and a survey of the basic principles of 
electrical engineering. Items were written for 
each of these subject matter fields. Minute 
details were avoided; only important princi- 
ples basic to the fields were included. By so 
doing, it was hoped that esoteric informa- 
tional content would be ruled out as a deter- 
miner of item difficulty. From the initial 
pool of 135 items, 90 were selected for preliminary
administration. The following are
examples of the analogies² which were used:


Consider a triode:

Spectators:turnstile::plate current:

(1.) cathode; (2.) plate; (3.) anode;
(4.) grid.

Pauper:money::riveted butt joint:

(1.) bearing stress; (2.) bending stress;
(3.) tensile stress; (4.) shearing
stress.

Diameter:circumference::y = bx:

(1.) $x^2 + y^2 = r^2$; (2.) $x^2 = 2py$;
(3.) and (4.) [equations illegible in source].


The 90-item test was administered to 203 
engineering seniors enrolled in G.E. 103, a 
survey course of engineering ethics, which is 
required of all graduating seniors in the In- 
stitute of Technology at the University of 
Minnesota. 

² The correct response for the first example is (4.)
grid; for the second is (2.) bending stress; and for
the third is (1.) $x^2 + y^2 = r^2$.

Of the 203 seniors who took the preliminary
form of the Minnesota Engineering Analogies
Test, only 91 completed every item on
the test. This was because of the limited
time afforded (50 minutes) by the length of 
the class period. Each student’s score was, 
therefore, expressed in terms of an accuracy 
index derived by dividing the number of items 
answered correctly by the number answered 
incorrectly. A scatter diagram portraying the 
relation between the number of items at- 
tempted and the accuracy index indicated 
that slow workers worked just as accurately 
as more rapid workers. Thus it was indi- 
cated that ability to finish the test was not 
related to a student’s proficiency in the test. 
Therefore, it did not seem desirable to include 
the speed factor in the analysis of results. 
For item analysis purposes then, the scores 
were expressed in terms of the accuracy 
index. 
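
The scoring rule described above can be sketched as follows. This is only an illustration: the function name and the student counts are hypothetical, and the treatment of a paper with no wrong answers is an assumption, since the article does not say how such a case was scored.

```python
def accuracy_index(num_right: int, num_wrong: int) -> float:
    """Ratio of items answered correctly to items answered incorrectly (R/W)."""
    if num_right < 0 or num_wrong < 0:
        raise ValueError("counts must be non-negative")
    if num_wrong == 0:
        # Assumption: the article does not say how an error-free paper was scored.
        return float("inf")
    return num_right / num_wrong


# Two hypothetical students: one attempted all 90 items, one only 60.
fast_worker = accuracy_index(60, 30)
slow_worker = accuracy_index(40, 20)
print(fast_worker, slow_worker)
```

Because the index ignores unattempted items, the two hypothetical students receive the same score even though one finished the test and the other did not, which is the property the scatter diagram was used to justify.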

Davis’ item analysis chart (2) was used 
to compute biserial validity coefficients. Two 
indexes were computed for each item. The 
first validity index was based on an “internal” 
criterion, the accuracy index. The second 
validity index was based on an “external” criterion
consisting of the over-all honor point
average³ earned at the University of Minnesota.
A total of 63 items exhibiting validity
coefficients above .10 on both criteria were 
combined into a final form of the test. This 
final form was then administered to 53 grad- 
uate students in engineering. 
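
The item-selection step may be pictured with the sketch below. It is not Davis' chart, which is a graphical aid; the textbook biserial formula is substituted here purely for illustration, and the function names and all data values are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(passed_item, criterion):
    """Textbook biserial r between a pass/fail item and a continuous criterion."""
    hits = [c for ok, c in zip(passed_item, criterion) if ok]
    misses = [c for ok, c in zip(passed_item, criterion) if not ok]
    if not hits or not misses:
        raise ValueError("item must be passed by some examinees and failed by others")
    p = len(hits) / len(criterion)
    q = 1.0 - p
    # Ordinate of the unit normal curve at the point dividing it into areas p and q.
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (mean(hits) - mean(misses)) / pstdev(criterion) * (p * q / y)


def select_items(internal_r, external_r, threshold=0.10):
    """Indices of items whose coefficients exceed the threshold on both criteria."""
    return [i for i, (a, b) in enumerate(zip(internal_r, external_r))
            if a > threshold and b > threshold]


# Hypothetical responses to one item and hypothetical honor point averages.
passed = [1, 1, 1, 0, 1, 0, 0, 0]
hpa = [2.8, 2.5, 2.1, 2.0, 1.9, 1.6, 1.2, 1.0]
print(round(biserial_r(passed, hpa), 2))

# Three hypothetical items; only the first exceeds .10 on both criteria.
print(select_items([0.30, 0.05, 0.22], [0.15, 0.40, 0.08]))
```

Requiring an item to clear the threshold on both the internal and the external criterion is the rule stated in the text; the threshold of .10 is taken from the article, everything else is assumed.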


Results 


Among the 203 seniors, the correlation be- 
tween honor point average and the accuracy 
index for the 90-item test was .57. A corre- 
lation of this magnitude was considered en- 
couraging in view of the fact the test still 
contained many poorly discriminating items. 

Table 1 shows distributions of the validity 
coefficients obtained in the item analysis. 

Table 1

Magnitudes of Validity Indexes Obtained

              Internal Consistency    External Validity
              Criterion (R/W)         Criterion (HPA)
Biserial r    Number of Items         Number of Items

< 0                   4                     12
0-.10                12                     14
.11-.20              15                     26
.21-.35              28                     32
> .35                31                      6

Total                90                     90

³ The honor point average is calculated on the basis
of three honor points for each credit of A, two
for B, one for C and zero for either D or F.

The corrected odd-even reliability of the
63-item test administered to the graduate students
was .86. A reliability of this magnitude
compares favorably with the reliability
coefficients of some of the more widely used
standardized tests. The distribution of responses
made by the graduate group showed
that many of the distractors, apparently adequate
within the senior group, failed to function
effectively for graduate students. However,
for the graduate group, the average
Davis Difficulty Index (2) was 57. This
value corresponds to a proportion of successes
of .63. This indicates that in spite of the
shrinkage of distractor effectiveness, the test
was moderately difficult for these high ability
students.
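
The article does not name the correction applied to the odd-even coefficient. Assuming it was the usual Spearman-Brown step-up for a test of doubled length, the relation between the half-test correlation and the corrected reliability can be sketched as follows; both function names are hypothetical.

```python
def spearman_brown(r_half: float) -> float:
    """Step a half-test correlation up to the reliability of the full-length test."""
    return 2.0 * r_half / (1.0 + r_half)


def half_test_r(r_full: float) -> float:
    """Invert the step-up: the half-test correlation implied by a corrected value."""
    return r_full / (2.0 - r_full)


# Under the Spearman-Brown assumption, a corrected reliability of .86
# implies an uncorrected odd-even correlation of roughly .75.
implied = half_test_r(0.86)
print(round(implied, 3))
print(round(spearman_brown(implied), 2))
```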

For purposes of comparison, the tests of 
the seniors who finished the test were re- 
scored for the 63 items of the final form. 
Figure 1 shows the distribution of scores for 
the two groups (seniors and graduate stu- 
dents) on this form. The graduate students 
scored markedly higher having a mean of 
37.1 and S.D. of 7.24 compared with the 
senior mean of 28.7 and S.D. of 7.18. The 
critical ratio was 6.76. 
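
As a check on the figure just quoted, a critical ratio can be recomputed from the published means and standard deviations, taking the group sizes as the 53 graduate students and the 91 seniors who finished the test. The standard-error formula below is an assumption; the article does not state which one was used.

```python
from math import sqrt


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by the standard error of the difference."""
    se_diff = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Graduate students versus seniors on the 63-item form.
cr = critical_ratio(37.1, 7.24, 53, 28.7, 7.18, 91)
print(round(cr, 2))
```

Under these assumptions the result falls near 6.7, close to the reported 6.76; the small gap may reflect rounding of the published values or a slightly different standard-error formula.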

In terms of overlap, only 13 per cent of the 
seniors exceeded the median of graduate stu- 
dents, and only 9 per cent of the graduate 
students fell below the median of the seniors. 
The low amount of overlap is a definite indi- 
cation that the test operates in a valid man- 
ner to identify the more able engineering 
students. 
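
The overlap statistic used here is simply the percentage of one group falling beyond the median of the other. A minimal sketch, with hypothetical function names and made-up scores rather than the study's data:

```python
from statistics import median


def percent_above_median(scores, reference_scores):
    """Per cent of `scores` exceeding the median of `reference_scores`."""
    cut = median(reference_scores)
    return 100.0 * sum(s > cut for s in scores) / len(scores)


def percent_below_median(scores, reference_scores):
    """Per cent of `scores` falling below the median of `reference_scores`."""
    cut = median(reference_scores)
    return 100.0 * sum(s < cut for s in scores) / len(scores)


# Hypothetical score lists for a lower and a higher scoring group.
lower_group = [10, 20, 30, 40]
higher_group = [25, 35, 45]
print(percent_above_median(lower_group, higher_group))
print(percent_below_median(higher_group, lower_group))
```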

But to what extent does the test differen- 
tiate among graduate students with different 
abilities? In order to investigate this ques- 
tion, the graduate student group was divided 
into first, second, and third year students according
to the following plan: 1st year, 0-3
quarters; 2nd year, 3-6 quarters; and 3rd
year, more than 6 quarters.

Figure 2 shows the distribution of scores
within each of these three groups. The first
and second year students showed similar performance,
but the third year students exhibited
marked superiority. These results are
important when it is remembered that third
year students include only the carefully
screened Ph.D. candidates, the other two
groups consisting of master’s candidates. Table
2 summarizes the performance on the 63-item
test of all four groups (seniors, 1st, 2nd,
3rd year graduate students). In terms of
overlap, only 13 per cent of the students in
the first two years of graduate school reached
or exceeded the median of 3rd year students.
In like manner, only 19 per cent of the latter
group fell below the median of the former.
These results provide further impressive evidence
of the validity of the 63-item test. To
the extent that candidates for the Ph.D. degree
are, as a group, more able than other
students in graduate engineering, the ability
of the test to identify the more competent
group is established.

Table 2

Differential Performance of Seniors and Graduate Students on the 63-Item Test

Group                       Mean    Standard Deviation    Probability Level

Seniors                     28.7          7.18
                                                           P < .001
1st Year Grad. Students     34.6          6.26
                                                           P = .75
2nd Year Grad. Students     35.3          5.63
                                                           P < .001 (?)
3rd Year Grad. Students     42.4          7.05

It may be concluded that the exploratory
use of the special analogy test has proved it
to be a feasible device for the evaluation of
engineering graduates. This conclusion is
based on the fact that a large number of the
analogy items discriminated sharply between
academically superior and inferior students.
Further support is drawn from the finding of
significant differences between graduating seniors
and graduate students and between graduate
student groups with varying levels of
ability. It is suggested that further investigation
of the special analogy test may be
most profitable in the development of instruments
for the assessment of high level engineering
abilities.

[Figure 1. Distribution of senior and graduate student scores on the 63-item test. Panels: graduating seniors (N = 91); graduate students.]

[Figure 2. Differential distribution of graduate student scores on the 63-item test. Panels: first year graduate students (N = 24; Q1 = 29, Q2 = 34, Q3 = 38); second year graduate students (N = 13; Q1 = 31); third year graduate students (N = 16; Q1 = 36, Q2 = 44, Q3 = 48).]

Marvin D. Dunnette


Summary


Studies have shown that degree of success
in engineering studies can be most effectively
predicted by a combination of measures including
previous academic performance, general
intelligence, and mathematical facility.
The nature of the verbal analogy item suggested
that it is peculiarly fitted to the task
of assessing the above attributes.

With this in mind, a 90-item engineering
analogies test was constructed and administered
to 203 engineering seniors. Item analyses
were made using internal consistency and
external validity criteria. The 63 most discriminating
items were combined and administered
to 53 graduate students. The odd-even
reliability of this form was found to be
.86 for these highly selected graduate students.

The results for the 63-item test were exam- 
ined with respect to comparisons within and 
between graduate students and seniors. The 
performance of graduate students was mark- 
edly superior to that of the seniors. Within 
the graduate group, the performance of third 
year students (Ph.D. candidates) was supe- 
rior to that of first and second year students 
(M.A. candidates). 

It was concluded that a special analogies 
test effectively assesses engineering abilities. 


The Humm-Wadsworth Temperament Scale
was first published in 1935. Its primary pur- 
pose was for use as an aid in industrial selec- 
tion. The scale consists of 318 questions, but 
of these only about half are scored. The 
others are for use as a “setting” for the scored
items. The scale is based upon the Rosanoff 
classification of personality and purports to 
measure seven different components. The 
test was standardized on seven groups of sub- 
jects each representing a relatively pure type 
of that component. A highly complicated
method of scoring and validation has been de- 
vised. Suffice it to say that the split half re- 
liability of the various components varied 
from .70 to .90 and the validity as checked 
against new criterion groups was .85 to .98 as 
reported by the authors (1). The reliabil- 
ities have been rechecked by Dysinger (4) by 
the test-retest method and found to be even 
higher than those reported by the authors. 
Most of the components are independently 
variable; only two components, manic and 
depressive, show high intercorrelations (r =
.88).

The scale has been widely used with col- 
lege students, psychotic groups, and in indus- 
try. No attempt will be made to review all 
these studies. Reed and Wittman (5) gave 
the scale to 477 Elgin Hospital patients and 
compared the scores with a normal control 
group. Only the normal and cycloid com- 
ponents were significantly different for the 
two groups. Dorcus (3) used the scale with 
an industrial group. He reports that it cor- 
rectly diagnosed 73% of the poor group and 
65% of the superior group. 


In the present study the Humm-Wadsworth
scale was given to the employees of a relatively
large industrial organization employing largely
“white collar” workers. The scale was administered
approximately ten years ago and the evaluation
was made about nine years later. The scales
for 405 employees who constituted those with
surnames beginning with letters before “I” in the
alphabet were scored. This should constitute a
random sample from approximately half the population.
Of this group, 191 were still employed
and rated as “successful” or “satisfactory.” Another
group of 139 had withdrawn from the company
but without any unfavorable service record.
Another group of 75 had either been dismissed
or resigned while on probation. These are classified
as “undesirable.” Using a method of score
evaluation as nearly as possible like that used by
Humm (2) in his study of Los Angeles policemen,
and checking the method further, as best
we could in a personal conference with the author,
the 405 employees were classified in terms
of their Integration Index and Component Control
Measure into five groups: very good risks,
good, questionable, poor, and very poor. Those
with Integration Indices and Component Control
Measures all above 5, for example, were classified
as very good risks. Those with at least two ratings
as low as 1 were called very poor risks.

* Human Resources Research Center, Keesler AFB,
Miss.


Of the employed group of 191 still with the 
company and doing satisfactory work, 9.4% 
received a very good rating on the test and 
5.7% received a very poor scale rating. Of 
the 139 no longer employed but with no evi- 
dence about their success, 12.2% were rated 
very good by the test and 5.8% as very poor. 
Of the 75 who had been dismissed or with- 
drew for cause, 12.0% were classified as very 
good risks by the scale and 5.4% as very poor 
risks. Thus it is apparent that these results
show no difference between the three employed
groups in terms of scores on the
scale.

As another method of evaluation, the data 
were arranged in a 3 × 5 table with three
groups of employees in terms of their work 
record as one axis and five degrees of success 
on the scale in terms of the Integration Index 
and Component Control Measure as the other 
axis. From this table a chi square as a test 
of deviation from the null hypothesis was cal- 
culated. A chi square of 5.93 was obtained. 
With eight degrees of freedom these data gave
a p of .65. That is, so great a difference as 
this would occur by chance 65 times out of 
100. When the middle group who had left 
the company between the time of testing and 
the time the study was made was omitted, 
no important change in the relationship was 
apparent. 
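
The degrees of freedom and the probability quoted above can be checked with a short calculation. The sketch below uses the closed-form upper-tail probability of chi square that holds when the degrees of freedom are even; the function name is hypothetical, and the cell frequencies of the 3 × 5 table are not reproduced because the article does not give them.

```python
from math import exp


def chi_square_p_even_df(chi_sq: float, df: int) -> float:
    """Upper-tail probability of chi square; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form holds only for positive even df")
    half = chi_sq / 2.0
    term, total = 1.0, 1.0
    # Sum of (chi_sq / 2)^k / k! for k = 0 .. df/2 - 1.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return exp(-half) * total


# A 3 x 5 table has (3 - 1)(5 - 1) degrees of freedom.
df = (3 - 1) * (5 - 1)
p = chi_square_p_even_df(5.93, df)
print(df, round(p, 3))
```

With chi square equal to 5.93 and eight degrees of freedom this gives a probability of about .655, in line with the reported p of .65.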

Inspection of the seven components of the
test showed no component in which there was
a significant difference between the satisfactory
and unsatisfactory employees. An examination
of the Integration Index, which is
a summary value obtained from the test, also
showed no difference between these two
groups.

The failure of the test to differentiate the
satisfactory from the unsatisfactory workers
by any of the above methods may be due to
any one or a combination of the following:
(1) the test may not adequately measure
the components it purports to measure; (2)
these components may not be essential elements
for success in this industry; and (3)
the company cannot distinguish between satisfactory
and unsatisfactory workers.


This study does not prove that the Humm- 
Wadsworth scale may not be successful in 
selecting workers in some industries but it 
certainly gave no evidence of success in this 
situation. 


Received August 4, 1952. 


The Pennsylvania State College


This study is concerned with the prediction 
of success and failure in the elementary 
courses in French, Spanish, and German at 
the Pennsylvania State College. Predictions 
were made on the basis of scores on the Penn- 
sylvania State College Academic Aptitude 
Examination (3), parts one and two. Sepa- 
rate predictions were made for each of the 
above mentioned languages. 


Procedure 

The Subjects. The subjects in this study 
were all the freshmen in the Pennsylvania 
State College who were enrolled in the elementary
courses in French, Spanish, and
German in September 1951, and had taken 
the Pennsylvania State College Academic 
Aptitude Examination. The total number 
of subjects is 443 divided among the three 
languages in the following manner: (1) 
French—47; (2) Spanish—189; and (3) 
German—207. 

Since the study is directed toward pre- 
dicting success it was felt that freshmen 
would be the best subjects. According to 
Feder (2) the function of prediction in edu- 
cation is to facilitate guidance, and, if it can 
be effective, educational guidance is most val- 
uable when applied to freshmen who are be- 
ginning their academic careers. 

The Criterion. The criterion used for suc- 
cess was the teachers’ grades in the three 
foreign language courses. The grades at the 
Pennsylvania State College range from −2
to 3, with 3 being the highest possible
grade which can be attained in a course, and
−2 being the lowest failing grade. A
grade of −1 is also given and this too is a
failing grade. Since it was felt that those
students who received a grade of 0 (the lowest
passing grade) had not achieved more
than the barest minimum of success, the students
who received such a grade were placed
in the failing group. The composition of the
groups then becomes as follows: (1) Passing:
grades 3, 2, and 1; and (2) Failing: grades
−2, −1, and 0.

* The author wishes to express his sincere appreciation
to Dr. William U. Snyder and Dr. Ila H. Gehman
under whose direction this study was done.

The Predictive Instrument. The instru- 
ment used in this study was the Pennsylvania 
State College Academic Aptitude Examina- 
tion, parts one and two. The verbal nature 
of these two parts (vocabulary and paragraph 
reading) does not necessarily indicate that 
they would be particularly useful in the pre- 
diction of success in the study of a foreign 
language. However, it seems that the two 
or more skills or achievements involved in 
these tests might be expected to have a direct 
relationship to language skills. All items on 
both tests are of a multiple choice nature with 
five possible choices. 

Bernard (1) feels that the “learning of a 
foreign language consists fundamentally in 
the acquisition of an additional set of sym- 
bols for old, familiar meanings. . . . Since 
the most pressing need for the student is the 
knowledge of the meaning of these new sym- 
bols, the preponderant importance of vocab- 
ulary becomes at once apparent.” Symonds 
(4) states that “presumably there should be 
some relationship between the size of English 
vocabulary and the ability to learn a new 
language.” 

Considering the importance of vocabulary 
in the learning of a foreign language it is felt 
that the measurement of the skill involved 
in learning a vocabulary will be of some 
value in predicting success in learning a for- 
eign language. Since this cannot be done 
directly, we used a measure of proficiency in 
English vocabulary as being indicative of the 
ability to learn vocabulary, with the feeling
that the same skills involved in developing 
the English vocabulary are operating in learn- 
ing the vocabulary of a foreign language. 

The paragraph reading part of the test indicates
not only the subject’s ability to read
a paragraph, but also his ability to under- 
stand what he has read, as measured by his 
answers to a set of questions concerning the 
subject matter of the paragraph. In being 
able to understand the meaning of a para- 
graph the subject must have an adequate vo- 
cabulary, must be able to learn the meanings 
of new words from their context, and must 
have some knowledge of grammar. These 
skills are of importance also in the learning 
and mastery of a foreign language. 

Experimental Design. The procedure used 
in this study was as follows: The grades of 
all freshmen registered in the courses in ele- 
mentary French, Spanish, and German at the 
Pennsylvania State College were collected 
from the respective language departments. 
The grades were divided into two groups, 
with an equal number of grades of each lan- 
guage in both groups. This was done by 
selecting every other student (in each lan- 
guage group) from a list of students, ar- 
ranged in order of descending magnitude of 
grades, and placing him in one group. The 
other group was composed of the remaining 
students. This method insured an approxi- 
mately equal distribution of grades for each 
group. Of the 507 students in these two 
groups, 443 had taken the Academic Aptitude 
Examination and their scores had been re- 
corded by the psychology department. These 
scores were then collected and the analysis 
begun. 
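
The division into two matched groups can be sketched as below. The roster and the function name are hypothetical; in the study the procedure was carried out separately within each language.

```python
def alternate_split(students):
    """Rank (name, grade) pairs by descending grade and deal them out alternately."""
    ranked = sorted(students, key=lambda s: s[1], reverse=True)
    return ranked[0::2], ranked[1::2]


# Hypothetical roster for one language; grades run from -2 to 3.
roster = [("a", 3), ("b", 2), ("c", 2), ("d", 1), ("e", 0), ("f", -1)]
test_group, validation_group = alternate_split(roster)
print([grade for _, grade in test_group])
print([grade for _, grade in validation_group])
```

Dealing every other student from a grade-ordered list keeps the two grade distributions nearly identical, which is the property the text relies on.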

What was sought was a point, or score, 
on the Academic Aptitude Examination (parts 
one and two) below which are found those 
students who cannot make passing grades (a 
grade of 0 was considered to be a failure for 
reasons previously discussed) in their lan- 
guage courses. 

This cut-off point was determined on one 
of the two groups (many points were tried 
and the one yielding the best prediction was 
selected) and its validity was determined by 
testing its applicability to the students com- 
prising the second group. Different cut-off 
points were located for each language, with 
the expectation that they would be approxi- 
mately the same for all the languages. 

Each of the two parts of the test was used
separately in finding the cut-off points, and 
the two tests were also combined and a cut- 
off point on the combined score was located 
for each language. 
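
The search for a cut-off score might look like the following sketch. The article says only that many points were tried and the one yielding the best prediction was kept; taking "best" to mean the largest gap between the failure rates below and above the cut-off is an assumption made here, and the scores and outcomes are invented.

```python
def failure_rates(scores, failed, cutoff):
    """Proportions failing below and at-or-above the cut-off; None if a side is empty."""
    below = [f for s, f in zip(scores, failed) if s < cutoff]
    above = [f for s, f in zip(scores, failed) if s >= cutoff]
    if not below or not above:
        return None
    return sum(below) / len(below), sum(above) / len(above)


def best_cutoff(scores, failed):
    """Try every observed score as a cut-off and keep the one with the largest gap."""
    best, best_gap = None, float("-inf")
    for cutoff in sorted(set(scores)):
        rates = failure_rates(scores, failed, cutoff)
        if rates is None:
            continue
        gap = rates[0] - rates[1]
        if gap > best_gap:
            best, best_gap = cutoff, gap
    return best


# Hypothetical test group: aptitude scores and failure flags (1 = failed).
scores = [10, 12, 15, 18, 20, 25, 30, 35]
failed = [1, 1, 0, 1, 0, 0, 0, 0]
cutoff = best_cutoff(scores, failed)

# The chosen cut-off is then applied unchanged to a second, hypothetical group.
check = failure_rates([11, 14, 19, 22, 28, 33], [1, 1, 0, 1, 0, 0], cutoff)
print(cutoff, check)
```

Fixing the cut-off on one group and applying it unchanged to the other is what makes the second group a cross validation rather than a second fitting.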

Statistical Analysis. The statistic which
was computed was the significance of difference
between per cents failing above and below
the various cut-off points which were
established (t test).
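
A minimal sketch of such a test is given below, assuming the usual unpooled standard error for the difference between two independent proportions; the article does not give its formula. The function name is hypothetical and the counts are invented, since group sizes are not reported in the text.

```python
from math import sqrt


def percent_difference_ratio(fail_below, n_below, fail_above, n_above):
    """Ratio of the difference in failure proportions to its standard error."""
    p1 = fail_below / n_below
    p2 = fail_above / n_above
    sigma_diff = sqrt(p1 * (1 - p1) / n_below + p2 * (1 - p2) / n_above)
    return (p1 - p2) / sigma_diff, sigma_diff


# Hypothetical counts: 20 of 26 failing below the cut-off, 26 of 73 above it.
ratio, sigma = percent_difference_ratio(20, 26, 26, 73)
print(round(ratio, 2), round(sigma, 3))

# The same proportions in a group one quarter the size give a larger sigma.
_, sigma_small = percent_difference_ratio(5, 6.5, 6.5, 18.25)
print(round(sigma_small, 3))
```

The second call illustrates why a small group makes significance harder to reach: with the proportions held fixed, quartering the numbers doubles the standard error of the difference.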


Results 

The vocabulary test proved to be a valid 
differentiator between students who achieved 
success, and students who failed in foreign 
language study. Table 1 gives the results of 
the application of the cut-off scores to the 
test group in each of the three languages. It 
can be seen that the vocabulary test was most 
successful in distinguishing between success- 
ful and unsuccessful students when applied to 
the Spanish group. Here, of the students 
who fell below the cut-off score 76.9% failed, 
while only 35.6% of those exceeding the cut- 
off score failed the Spanish course. This dif- 
ference was significant at the .001 level. Only 
with French is the difference not significant 
at the one per cent level. 

To determine the validity of the vocabu- 
lary test as a predictor, the various cut-off 
scores were applied to the second group (the 
cross validation group). Table 1 shows that 
there was no decrease in the significance of 
difference in the Spanish group, but in the 
other languages a decrease was found. 

The lack of significance in the French group 
can probably be attributed to the compara- 
tively small number of subjects in the test 
and validation groups of this language. The 
effect of these small numbers is revealed in 
the relatively high σ’s of difference found in
these groups. These were found to be almost 
double those of the other groups. This weak- 
ness was found to be operating when predic- 
tions were made with either of the tests as 
well as when the tests were combined to form 
a single predictor. 

The paragraph reading test proved to be 
most efficient in predicting success and fail- 
ure in German. Of the students who fell be- 
low the cut-off score, 63.8% failed German. 
Among the students who exceeded the cut-off
score only 23.9% failed. This difference was 
found to be significant at the .001 level. 

In the cross validation group, however, the 
greatest success in prediction was made with 
the Spanish group, and once again the least 
success was found with the French group. 

Combining the vocabulary and paragraph 
reading tests does not yield a marked im- 
provement in prediction of success and fail- 
ure, as will be revealed by an examination of 
Table 1. Especially in the cross validation 
group are the results found to be almost ex- 
actly those attained when the two tests were 
used separately. 

Summary 

The object of this study was to determine 
the efficiency of parts one (vocabulary) and 
two (paragraph reading) of the Pennsylvania 
State College Academic Aptitude Examina- 
tion in predicting success and failure in the 
elementary courses in the modern foreign 
languages. 

The results of this study show: 

1. The greatest success in prediction was 
achieved with the Spanish group. 

2. Success in French was most difficult to 
predict. This was attributed to the small 
number of subjects in this group. 

3. There was very little difference in the 
efficiency of the predictive instruments. The 
combined tests were generally most successful 
and the vocabulary test probably somewhat 
more effective than the paragraph reading. 

This study demonstrated that it is possible 
to predict success and failure in the modern 


foreign languages. It was further demon- 
strated that tests of vocabulary and para- 
graph reading can be used to make this pre- 
diction. 

It is suggested that the college administra- 
tion take the responsibility for selecting a 
method of giving the students with low lan- 
guage aptitude (as measured by any of the 
instruments used in this study) an oppor- 
tunity to derive some value out of foreign 
language study. This would no doubt in- 
volve some special treatment such as can be 
provided by special classes. 

The highly significant results found in this 
study indicate that procedures such as these 
could be applied in schools other than the 
Pennsylvania State College. It is likely, however,
that each school would find it expeditious
to determine for itself the most efficient
predictive score. If necessary, individual
schools could substitute other tests of a
similar nature with which comparable results 
might be obtained. 


Received July 15, 1952. 
The Ohio State University 


This is a study of 1,244 students, of whom 
78 were women, who took college algebra 
and of their success in subsequent courses in 
mathematics. The study followed these stu- 
dents until 29 remained at the end of seven 
courses in the prescribed sequence. Twenty- 
five of these remaining 29 students had been 
graduated at the time of this writing. 

The seven courses in mathematics making 
up the sequence are briefly described as fol- 
lows: 


421. College Algebra. Five credit hours. . . . 

422. Trigonometry. Five credit hours. . . . 
Prerequisite, Mathematics 421. . . . 

423. Analytic Geometry. Five credit hours. 
. . . Prerequisite, Mathematics 422. . . . 

441. Calculus. Five credit hours. . . . Pre- 
requisite, Mathematics 423. Differentiation of 
algebraic forms, with applications; successive dif- 
ferentiation; differentiation of transcendental 
functions; parametric equations, differentials; 
curvature; theorem of mean value; indeterminate 
forms. 

442. Calculus. Five credit hours. . . . Pre- 
requisite, Mathematics 441. Integration of stand- 
ard elementary forms, and integration by various 
devices; definite integrals; application to geome- 
try and physics. 

443. Calculus. Five credit hours. . . . Pre- 
requisite, Mathematics 442. Numerical series 
and power series; differential equations; hyper- 
bolic functions; partial differentiation; multiple 
integrals, and applications. 

601. Advanced Calculus. Five credit hours. 
. . . Prerequisite, Mathematics 443. The theory 
of limits, functions, continuity; definition and 
meaning of ordinary and partial derivatives; 
definition of definite integrals, proper and im- 
proper; fundamental theorem of the integral cal- 
culus; functions defined as integrals containing a 
parameter; mean value theorems; convergence of 
series; power series; implicit functions. 

The course in advanced calculus was chosen 
as the culminating point because it is thought 
to represent the type of thinking believed 
to be important in the graduate study of 
mathematics. 

The sample is 1,244 students who were 
enrolled in college algebra at the Ohio State 
University in the autumn quarter of 1946. 
The data were collected after most of the 
students were presumed to have had time to 
complete the entire sequence. 

In addition to course grades, percentile 
scores on the Ohio State Psychological Ex- 
amination (OSPE) were included in the com- 
putations. 

For the 29 students who completed the 
entire seven-course sequence some personal 
data are given to describe this sample in 
greater detail. 

Table 1 presents the intercorrelation mat- 
rices of OSPE and mathematics course grades. 
These are presented in a manner such that 
one can easily see how the coefficients of cor- 
relation change as the sample decreases in 
size. 

The means presented in Table 1 show that 
the better students in the early courses tend 
to go on into the more advanced courses. 
The mean OSPE gradually increases, until 
the last group is reached where there is a 
sharp upward trend. The mean OSPE of the 
29 students in the advanced calculus course 
is 79.1 percentile. 

In Table 2 are presented the regression 
coefficients and coefficients of multiple cor- 
relation. Although some of the coefficients 
in the regression equation are negative, none 
of the negative coefficients is significantly dif- 
ferent from zero. 

The 29 personnel cards filled out during 
Freshman Week by the 28 men and one 
woman who took Advanced Calculus were 
examined in an attempt to discover some 
clues as to success in advanced mathematics. 
Of the 29 persons who took seven courses in 
mathematics, 12 made A or B grades in Ad- 
vanced Calculus, 17 made C, D or E grades. 
Chi-square was used to test for independence 
of the grade classification and the following 
classifications: 


1. Like mathematics versus like some other 
subject (χ² = 0.000). 
Table 1 


Coefficients of Correlation, Means, and Standard Deviations 


Note: OSPE scores are percentiles. Grades are expressed on the basis of A = 4, B = 3, C = 2, D = 1. 


[Table body not legible in the source. Columns: OSPE and Mathematics Courses 421, 422, 423, 441, 442. Rows: Math 421, Math 422, Math 423, Math 441, Math 442, Math 443, Mean, Standard Deviation.] 
Table 2 


Regression Coefficients and Coefficients of Multiple Correlation 


Mathematics                          OSPE 
Course        N      Criterion       X1 

421           [illegible]            .0144** 
422           978    y3              .0040** 
423           693    y4              .0025 
441           536    y5              .0017 
442           410    y6              -.004 [digits uncertain] 
443           326    y7              -.0005 
601           29     y8              .0116 


[Entries in the remaining predictor columns are only partly legible in the source and cannot be assigned to columns with confidence.] 


* Significantly different from zero (5% level). 
** Significantly different from zero (1% level). 


2. Dislike mathematics versus dislike some 
other subject (χ² = 0.000). 

3. One or both parents dead versus both 
parents living (χ² = 0.008). 

4. Family fewer than 4 children versus family 
of four or more children (χ² = 1.543). 

5. Went to college the year following H. S. 
versus additional time lapse between H. S. 
and college (χ² = 4.138, significant at 
5% level). 

6. Live alone versus do not live alone (χ² = 
5.250, significant at 5% level). 


The significant values of χ² tend to indicate 
that high grades in Advanced Calculus are 
associated with going to college immediately 
after high school graduation and with room- 
ing alone. 
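
The test of independence used above can be sketched for a single two-by-two classification. This is only an illustration: the paper reports the χ² values but not the cell counts, so the counts below are hypothetical (chosen only to match the reported split of 12 high and 17 low grades), and it is an assumption that no continuity correction was applied. The 5% critical value for one degree of freedom, 3.841, is the standard tabled value.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # Shortcut formula for a 2x2 table: N(ad - bc)^2 / (product of the four marginals)
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Tabled critical value of chi-square, 1 degree of freedom, 5% level
CRITICAL_5PCT_DF1 = 3.841

# HYPOTHETICAL counts (not from the paper): rows are A/B grades (12 students)
# and C/D/E grades (17 students); columns are the two levels of one classification.
hypothetical_chi2 = chi_square_2x2(10, 2, 8, 9)
is_significant = hypothetical_chi2 > CRITICAL_5PCT_DF1

# The two values reported as significant both exceed the critical value.
reported_significant = [x > CRITICAL_5PCT_DF1 for x in (4.138, 5.250)]
```

With only 29 cases, one degree of freedom applies to each of the six classifications, so each reported χ² can be compared directly with the same critical value.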

At the time of this writing, 25 of the 29 
students had been graduated with a mean 
cumulative point hour ratio of 2.89 (A = 4). 
The correlation between OSPE percentiles 
and cumulative point hour ratio at graduation 
is .20, but the correlation between cumulative 
point hour ratio and grades in Advanced Cal- 
culus alone is .63. 

[Table 2, continued: the columns for Mathematics 423 (X4) and 441 (X5), the Constant, and the Coefficient of Multiple Correlation are not legible in the source.] 


Summary 


1. This study reports coefficients of cor- 
relation, means and standard deviations of 
mathematics course grades and Ohio State 
Psychological Examination percentiles. 

2. Regression equations for predicting suc- 
cess in mathematics courses are presented. 

3. Some personal data from students’ of- 
ficial records are discussed briefly. 


Received January 19, 1953. 
and 
William C. Cottle 
Interest in the measurement of readability 
has been growing steadily since the first basic 
research was done by Vogel and Washburne 
(12) in 1928. Klare (8) in 1950 estimated 
that 34 formulas or methods for determining 
the reading difficulty of printed material had 
been devised. Five of the more recently de- 
veloped formulas in widespread use have been 
singled out for critical analyses here: the 
Dale-Chall, Flesch, Lorge, Lewerenz, and 
Yoakam formulas. 

Little has been done to ascertain the read- 
ing level necessary to understand the content 
of standardized testing materials. Johnson 
and Bond (7) have written one of the few 
articles on this specific topic. In their paper 
the Flesch formula was used for testing read- 
ing ease of nine standardized tests in common 
use in V. A. Advisement Centers. The gen- 
eral conclusion was that many tests are being 
administered to people who do not under- 
stand them because the tests are too difficult 
to read. 

Stefflre (10) made a study of the relative 
reading difficulty of six interest inventories 
using the Flesch formula. High correlation 
between the Flesch formula and other for- 
mulas was reported. Roeber (9) compared 
seven interest inventories as to word usage. 
The percentage of occurrence of different 
words appearing in the inventories was com- 
puted. He found a large number of words 
beyond the understanding of ninth graders. 
Thus, his recommendation for a glossary of 
terms does appear in a later form of one of 
the inventories. 

Testing instruments are becoming so varied 
and numerous that persons who use them 
need every help possible to determine the use- 
fulness of the instruments for particular pop- 
ulations. 

* Abstract of Forbes’ Ed.D. dissertation done at 
the University of Kansas under the direction of 
Cottle. 

This study was carried out in order to de- 
termine objectively the reading difficulty of 
standardized tests commonly used in counsel- 
ing and to develop a new and simplified 
method for determining the reading level of 
these standardized tests. It is believed that 
this simplified readability method will also 
be found useful in measuring the readability 
of public opinion polling questions and of 
headlines and slogans in advertising copy. 


Method 


Five of the more popular techniques for 
evaluating the reading difficulty of printed 
matter were critically analyzed in relation to 
standardized tests. The Dale-Chall, Flesch, 
Lorge, Lewerenz, and Yoakam formulas were 
applied to 27 selected standardized tests com- 
monly used for counseling at various educa- 
tional levels. The mean score of reading dif- 
ficulty was then obtained. 

The choice of the tests to be used in this 
study was determined from previous studies 
made upon test preference and from the 
newer tests indicated by the records of the 
University of Kansas Guidance Bureau. 

Berkshire and others (2) have tabulated 
responses that were received from 290 test- 
ing centers. They concluded that there is 
general agreement on approximately 15 to 
20 tests as being common to guidance test- 
ing. Beyond this point test preference varies 
widely. Tests were chosen from this list for 
analysis if they were reported by at least 25 
of the reporting centers as being one of the 
most commonly used tests. This same study 
shows in tabular form the results obtained 
from three other studies by Brophy and Long 
(3), Darley and Marquis (6), and Baker and 
Peatman (1). The findings from these three 
studies were comparable to those of Berk- 
shire and others. 


Table 1 


Comparative Grade Placement of Selected Standardized Tests According to Various Readability 
Formulas and the Application of the New Forbes Formula to Items 
and Instructions for these Tests 


Columns: Test; Dale-Chall; Flesch; Lorge; Lewerenz; Yoakam; Av. of Five Formulas; Forbes (Items); Forbes (Instructions) 


[Grade-placement values not recoverable from the source. Tests in the order printed:] 

MMPI 
School Inventory 
Calif. Test Pers. 
AGCT 
Guilford-Zimmerman 
Otis Q-S 
Adjustment Inv. 
Minn. Pers. Scale 
Mooney 
Bernreuter 
CTMM 
Stanford Ach. 
Kuder CM 
Otis Employ. 
Henmon-Nelson 
Iowa Silent 
Lee-Thorpe 
Kuder BB 
SRA Reading 
Cleeton 
Strong Voc. Int. 
Coop. Reading 
Minn. Reading 
Ohio State Psy. 
ACE 
Coop. Gen. Cult. 
Study of Values 


* Estimated grade; the formulas did not indicate grades at these levels. 

Standardized testing instruments that have 
become popular since the appearance of the 
above articles were checked for the frequency 
of their use at the University of Kansas Guid- 
ance Bureau. Nine tests were added to the 
original list to be analyzed. Six of these were 
published after the above cited studies on 
preference had been made. Two of the re- 
maining three that were added were reading 
tests, because of the nature of the present 
study. The one remaining test, the Minne- 
sota Personality Scale (Men) 1941, was 
added at the discretion of the writers. The 
tests were chosen from five of the general 
areas of testing listed by the Third Mental 
Measurements Yearbook (4): Character and 
Personality, Intelligence (group), Interests, 
Achievement Batteries, and Reading. They 
are listed in Table 1. 

[Table 1, continued: further columns of grade-placement values, not recoverable from the source.] 

The five formulas selected for study are 
the more recently developed techniques for 
measuring readability. They present several 
factors which have been used for determining 
reading difficulty of printed matter, namely, 
word difficulty, prepositional phrases, sen- 
tence length, number of syllables per one 
hundred words, number of different words 
and percentage of words beginning with cer- 
tain letters. Each of the formulas has been 
carefully developed and exhibits a fair de- 
gree of reliability and validity. In the Yoa- 
kam formula, “hard words” vary in difficulty 
according to their frequency and range of 
occurrence above the most common four 
thousand words. The Lewerenz formula is 
based solely on word difficulty, which it 
gauges from words with certain initial 
letters. The Flesch formula takes the length 
of a word as the index of its difficulty: the 
more syllables a word has, the more difficult 
it is. The Dale-Chall formula uses a list of 
three thousand words; any word not appearing 
on this list is considered difficult. 
The Lorge formula considers as a 
“hard word” any word other than the 769 
words that are common to the first one thou- 
sand most frequent English words on the 
Thorndike list and the first thousand most 
frequent words known by children entering 
the first grade. 

On the basis of the facts mentioned above, 
it would seem that the Lorge formula inter- 
pretation of “hard words” is too simple and 
limited for the purpose outlined here; the 
Dale-Chall method approaches a more real- 
istic and practical definition of difficult 
words; and the Yoakam formula is perhaps 
the most realistic of all the formulas for use 
with testing instruments. The idea that dif- 
ficult and easy words begin with certain let- 
ters, presented in the Lewerenz formula, does 
not seem to apply to standardized tests; and 
the number of syllables that a word has, as 
proposed by Flesch, does not necessarily give 
its index of difficulty. 
The grade level scores obtained for each 
test from the five formulas were averaged in 
order to obtain a mean grade level reading 
difficulty score for each test. These mean 
scores were taken as criterion grade level 
scores of reading difficulty for these selected 
tests. They are shown in Table 1. 

A definite difference was noted in the re- 
sults of the measurement of the various tests 
by these five formulas as shown in Table 1. 
There was as much as 8.13 grades difference 
in the reading difficulty of a single test as 
determined by two different formulas. 

At the same time the five formulas corre- 
lated significantly with each other. The rank 
order correlations ranged from .91 between 
the Dale-Chall and Flesch formulas to .59 
between the Lewerenz and Yoakam formulas. 
These intercorrelations are shown in Table 2. 

The rank order correlations between each 
of the formulas and the mean grade level 
score ranged from .95 for the Dale-Chall for- 
mula to .77 for the Lewerenz formula as 
shown in Table 2. 
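
A rank order correlation of the kind reported here can be sketched as follows. The sketch computes rho as the product-moment correlation of the ranks, giving tied scores the mean of the ranks they span; whether the authors handled ties this way is not stated, so that is an assumption, and the function names are illustrative only.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Rank order correlation: Pearson r computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when all values in a list are tied")
    return cov / (var_x * var_y) ** 0.5


# HYPOTHETICAL grade placements of five tests under two formulas (not from Table 1)
formula_a = [5.6, 6.2, 7.0, 8.3, 9.1]
formula_b = [6.1, 5.0, 8.8, 7.9, 9.6]
rho = spearman_rho(formula_a, formula_b)
```

Applied to the 27 tests, the two input lists would be the grade placements assigned by any two formulas, or by one formula and the mean of the five.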

Correlations were also computed by means 
of the ratio of the estimated true variance to 
the observed variance between each formula 
and the means of the five formulas (5). 
These correlations as shown in Table 3 ranged 
from .90 between the Dale-Chall formula and 
the mean of the five formulas to .84 between 
the Flesch formula and the mean of the five 
formulas. However, the scores obtained from 
this study for the reading level of each test 
correlated slightly over .95 with the mean of 
the five formulas and ranged between .95 and 
.72 for the five formulas as shown in Table 3. 


Table 2 


Intercorrelations (Rho) for the Five Formulas Applied to Twenty-seven Tests and Correlation 
(Rho) Between Each Formula and Mean of the Five Formulas 


                                                        Mean of 
Formula        Flesch   Lorge   Lewerenz   Yoakam       Five 

Dale-Chall      .91      .90      .65        .75         .95 
Flesch                   .81      .66        .66         .90 
Lorge                          [illegible] [illegible]   .89 
Lewerenz                                     .59         .77 
Yoakam                                                   .89 

[Only one entry, .89, is legible in the Lorge row; its placement under Mean of Five is inferred.] 
Table 3 


Correlations Between Forbes’ Formula, the Five Formulas Applied to Twenty-seven Tests 
and the Mean of the Five Formulas Using Ratio of Estimated 
True Variance to Observed Variance (5) 


                Dale-                                        Mean of 
                Chall   Flesch   Lorge   Lewerenz   Yoakam   Five 

Forbes           .95     .83      .84      .72        .90      .96 
Mean of Five     .90     .84      .87      .86        .87 


The knowledge of grammar needed to ap- 
ply most of the five formulas studied is con- 
siderable. Also, the amount of time required 
by these methods makes them quite laborious. 
More than ten hours were required to apply 
some formulas to a single test. The average 
amount of time for the working of a single 
formula on a single test was more than two 
and one half hours. The simplified Forbes 
method, in contrast, requires only approxi- 
mately one half hour per test. 

Word difficulty was used as a common fac- 
tor in all five formulas studied. It was also 
evident from a review of the literature that 
word difficulty was basic to the readability of 
all printed matter. 


Development of the Forbes Method 


The following steps were taken in devel- 
oping the Forbes method, which is specifi- 
cally suited for measuring the readability of 
printed matter in standardized tests: 

1. The five formulas studied were applied 
to each of the twenty-seven standardized 
tests. 

2. The mean grade level score of the five 
formulas for each test was taken as the cri- 
terion of readability for the tests. 

3. The vocabulary difficulty was deter- 
mined for each test by finding the number 
of words above the most frequently used 
4,000 words in three samples of 100 words 
each selected at the beginning, middle, and 
end of each test. 

4. The Thorndike Junior Century Diction- 
ary was used for finding the weights to be 
assigned to each word above the most com- 
monly used four thousand. The number fol- 
lowing the definitions in this dictionary is 
the weight for that word. The weights range 
from one to twenty, but since the first four 
thousand were dropped, only numbers of four 
and above were used. 

5. The total of these weights for each test 
was divided by the number of words in the 
samples, giving the index of vocabulary dif- 
ficulty. 

6. The standardized tests studied were 
placed in rank order as determined by the 
mean grade level scores of the five formulas. 
These tests were set off into grade groups. 
All tests falling within one half grade level 
above or below the grade were considered 
with that grade group. For example, grade 
level scores of 7.5 to 8.5 would be considered 
characteristic of the eighth grade reading dif- 
ficulty. The largest and smallest indices of 
vocabulary difficulty falling within any one 
grade group were considered the limits for 
that grade. Table 4 gives the indices of vo- 
cabulary difficulty, setting the limits for the 
various grade levels. 

7. The grade level scores derived from this 
method give the average reading grade level 
required for the person taking the test in 
order that the test be understood and done 
properly. 


Table 4 


Grade Level of Reading Difficulty as Determined by 
the Index of Vocabulary Difficulty 


Index of Vocabulary 
Difficulty               Grade Level 

1.4510 and above         College 
1.2510-1.4509            12th grade 
1.0510-1.2509            11th grade 
 .8510-1.0509            10th grade 
 .6510- .8509            9th grade 
 .4510- .6509            8th grade 
 .2510- .4509            7th grade 
 .0510- .2509            6th grade 
 .0509 and below         5th grade 

The exact method of applying the new for- 
mula is listed as follows: 

1. Three samples of 100 words each were 
taken from the tests to be analyzed. The 
samples were selected in each test at the be- 
ginning, middle, and end. The only require- 
ments for the samples were that they consist 
of an even hundred words, that each sample 
begin with the first word of an item, and the 
vocabulary tests be omitted from the samples. 
It seemed only fair to omit the vocabulary 
sections in order to get the average reading 
difficulty of the standardized tests. 

It seemed easiest to begin with the first 
word of the first item of a test and count the 
first hundred word sample exactly. The mid- 
dle sample was selected as near the midpoint 
of the test as possible. Starting with the mid- 
dle item count backward to the initial word 
of an item close to fifty words back. The re- 
mainder of the middle one hundred word sam- 
ple was secured by counting the difference 
from one hundred in words beyond this mid- 
dle item. The third sample was taken by 
counting backwards from the last word of 
the test items until one hundred words were 
counted. Should the one hundred words end 
within the item, proceed counting backwards 
until the first word of an item is reached, then 
in order to get exactly the one hundred words 
omit the number over one hundred at the end 
of the sample. 

2. Each word that appeared difficult to the 
grader was written on a sheet of paper. These 
words were then found in the 1942 Thorndike 
Junior Century Dictionary (11). The num- 
ber following the definition in this dictionary 
is the weight for that word. These numbers 
range from one to twenty, representing the 
first twenty successive thousands of words 
most commonly used in the English language. 
Only words above the most frequently used 
four thousand words were given a weight. 
Any word having a weight of four or above 
was considered a difficult word and its weight 
was listed. Words used more than once in 
the samples were given their weights each 
time they were used. 

3. The weights for the three samples were 
totaled and divided by the number of words 
in the samples, 300 in this case. This gave 
the index of vocabulary difficulty for the 
standardized tests. 

4. Using the indices of vocabulary diffi- 
culty obtained from the above three steps, 
refer to Table 4 in order to determine grade 
level of difficulty of the printed matter in the 
test being analyzed. Grade level scores may 
be interpolated to the nearest tenth of a 
grade. 

5. The reading difficulty was also figured 
for the instructions for each of the tests ana- 
lyzed. The samples in some cases included 
all directions to the tests when they consisted 
of 300 words or less; other samplings fol- 
lowed the procedure outlined above for the 
test, that is, taking 100 word samples at three 
points throughout the instructions. 
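
The scoring steps above can be sketched in a few lines. The band limits are those of Table 4, and the rule that only weights of four or above are counted follows step 2. Everything else is an assumption made for illustration: the function names are hypothetical, the dictionary look-up is taken as already done by hand, and the linear interpolation (each band of .2000 index points spanning one grade, centred on the grade) is one plausible reading of "interpolated to the nearest tenth of a grade", not a rule stated in the paper.

```python
# Lower limits of the index of vocabulary difficulty for each grade (Table 4)
GRADE_BANDS = [
    (1.4510, "College"),
    (1.2510, "12th grade"),
    (1.0510, "11th grade"),
    (0.8510, "10th grade"),
    (0.6510, "9th grade"),
    (0.4510, "8th grade"),
    (0.2510, "7th grade"),
    (0.0510, "6th grade"),
]


def vocabulary_index(weights_by_sample, words_per_sample=100):
    """Sum the weights of the difficult words and divide by the words sampled.

    weights_by_sample holds one list per 100-word sample; each entry is the
    dictionary weight of one occurrence of a word (repeated words are listed
    each time they occur). Weights below four are ignored.
    """
    total_words = words_per_sample * len(weights_by_sample)
    if total_words == 0:
        raise ValueError("at least one sample is required")
    total_weight = sum(w for sample in weights_by_sample for w in sample if w >= 4)
    return total_weight / total_words


def grade_level(index):
    """Look up the grade band of Table 4 for a given index."""
    for lower_limit, label in GRADE_BANDS:
        if index >= lower_limit:
            return label
    return "5th grade"


def interpolated_grade(index):
    """ASSUMED linear interpolation within the 6th to 12th grade bands."""
    if index < 0.0510 or index >= 1.4510:
        return None  # the open-ended bands give no basis for interpolation
    return round(5.5 + (index - 0.0510) / 0.2, 1)


# HYPOTHETICAL weights found in the beginning, middle, and end samples of a test
samples = [[4, 6], [10], [5, 5]]
index = vocabulary_index(samples)   # 30 weight points over 300 words
band = grade_level(index)
```

The same functions apply to the instructions of a test; when the instructions run to 300 words or fewer, a single sample holding all the weights and a matching `words_per_sample` would be passed instead.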

There is little room for decisions to be 
made by the scorer who uses the Forbes 
method since the words are weighted in ac- 
cordance with an accepted word list. If a 
variant of a word or a hyphenated word does 
not appear in this list, no weight is given. 
Only words that appear in the Thorndike 
Junior Century Dictionary are given weights. 


Summary 


1. Review of the literature showed that no 
specific method has been developed for find- 
ing the reading difficulty of standardized tests 
(or public opinion polling questions or head- 
lines and slogans in advertisements) up to 
the present time. 

2. The five techniques for measuring the 
readability of printed matter that were ap- 
plied to the 27 standardized tests in this 
study showed wide variation as to the grade 
placement of the reading difficulty of these 
tests. 

3. The usual methods in use for determin- 
ing the readability of reading material con- 
sume a great amount of time for their appli- 
cation. 

4. These methods also required much inter- 
pretation and judgment on the part of the 
user, thus greatly lessening their objectivity. 

5. The peculiar make-up of the reading 
matter in standardized tests required that only 
the vocabulary difficulty factor be used for 
determining their readability. The use of 
such factors as sentence length and preposi- 
tional phrases was not practical since many 
of the tests have sections composed only of 
word lists. 

6. The instructions to the standardized 
tests were easily within the range of reading 
difficulty of those for whom the tests were 
designed. 

7. The use of short word lists for determin- 
ing difficult words tended to give too coarse a 
classification of grade levels of reading. A 
longer list made the method for determining 
the readability of standardized tests more 
sensitive, spreading the grade level scores over 
a longer range. 

8. The method developed in this study was 
based entirely upon reading matter found in 
commonly used standardized tests. It is a 
technique applicable only to such reading 
matter or to similar material. 

9. The method evolved in this study is 
easily applied, consumes little time, and shows 
high objectivity by the elimination of most of 
the interpretations and judgments formerly 
left to the scorer. 


Received August 4, 1952. 
Dexterity Test 


Edwin A. Fleishman 


USAF Air Training Command, Human Resources Research Center * 


The O'Connor Finger Dexterity Test (8) 
has been widely used in counseling and selec- 
tion.¹ It appears to measure dexterity of a 
finer type than is measured by the Minnesota 
Rate of Manipulation Tests (11, 9), or by 
most of the subtests of the Purdue Pegboard 
(12). The test seems most useful for manual 
jobs requiring rapid wrist and finger move- 
ments, in fine assembly work requiring both 
speed and precision, and in jobs involving 
rapid manipulation of small objects. The 
validity of the test has on occasion been dem- 
onstrated for electrical fixture assemblers (14, 
16), radio assemblers (16), power-sewing ma- 
chine operators (10), watch assemblers (1, 
3), punch press operators (7), can packers 
(13), and dental students (5, 6). 

Despite its widespread use, the test has two 
primary difficulties as a selection device. 
First, relative to other dexterity tests (e.g., 
Minnesota Rate of Manipulation, Purdue Peg- 
board), the test takes considerably longer to 
administer. The time required generally var- 
ies from 8 to 15 minutes. Moreover, during 
this more lengthy time period, the test yields 
only one score, whereas in a considerably 
shorter administration time the Purdue Peg- 
board yields five scores (right, left, both 
hands, total of these, assembly), and the 
Minnesota Rate of Manipulation Test yields 
at least two scores (placing and turning). 

A second limitation of the O'Connor test 
for selection purposes is that it is a work limit 
test. The examinee’s score is the total num- 
ber of seconds it takes him to fill the board. 


* Perceptual and Motor Skills Research Labora- 
tory, Lackland Air Force Base, San Antonio, Texas. 
The data reported in this study were collected as 
part of the United States Air Force Human Resources 
Research and Development Program. The opinions 
or conclusions contained in this report are those of 
the author. They are not to be construed as reflect- 
ing the views or indorsement of the Department of 
the Air Force. 

¹ This paper is not concerned with the O’Connor 
Tweezer Dexterity Test, which has also received con- 
siderable study. 
This procedure makes it difficult for a single 
examiner to administer the test to more than 
one subject at a time. 

The present study investigated the feasi- 
bility of certain modified administration pro- 
cedures which would decrease the total time 
required to give the test and which would ren- 
der the test more suitable for group admin- 
istration.² 

Procedure 

The O'Connor Finger Dexterity Test was 
administered under time limit conditions to 
unselected samples of basic airmen at Lack- 
land Air Force Base. The mean age of the 
subjects was 18.9 with a standard deviation 
of 1.3. The test was administered to inde- 
pendent samples of 100 subjects each. One 
group received the test under a four-minute 
time limit, another under a five-minute limit, 
and a third under a six-minute limit. Within each 
sample the tests were readministered to obtain 
test-retest reliabilities. The 
interval between test and retest was held con- 
stant at one and one-half hours for each group, 
since it was assumed that the length of the 
interval might affect the magnitude of the re- 
liability coefficients. Under these time limit 
conditions, a subject’s score was the total 
number of pins placed during the allotted 
time.³ 

In another sample, 100 subjects were tested 
and retested one and one-half hours later un- 
der the standard work limit conditions in 
which the total number of seconds required 
to fill the entire board was recorded.⁴ Also 
recorded during these administrations was the 
time required to fill half the board. In an 
additional sample, 100 subjects were given the 
test under work limit conditions and were re- 
tested one and one-half hours later under the 
five-minute time limit condition. 


² There appears to be very little published evidence 
indicating whether or not time limit and work limit 
methods of administering speed tests are equivalent 
and interchangeable. In one of the few previous 
studies on this problem, Paterson and Tinker (11a), 
working with speed of reading tests, found the work 
limit method to agree with the time limit method as 
closely as each method agreed with itself. 

³ Since 3 pins are placed in each hole, the subject's 
score is three times the number of holes filled up to 
the last hole plus the number of pins in the last hole. 
There are 100 holes in the total board. 


Table 1 


Means, Standard Deviations, and Reliabilities for Five Administration Conditions¹ 


                                                          Test-Retest 
Testing Method                        Score               Reliability 

Work limit, full board                Number of seconds       .86 
Work limit, half board                Number of seconds       .82 
Time limit, 6 minutes (360 seconds)   Number of pins          .80 
Time limit, 5 minutes (300 seconds)   Number of pins          .76 
Time limit, 4 minutes (240 seconds)   Number of pins          .71 

[Means and standard deviations are not legible in the source.] 


¹ Data for the Work Limit tests are based on the same sample of 100 Airmen. Data for each Time Limit 
test are based on separate samples of 100 each. 

In all, 500 subjects were involved in the 
study. Independent groups were used in each 
phase in order to duplicate the standard test- 
ing conditions. In this way scores and re- 
liability coefficients derived from each ad- 
ministration procedure could be more readily 
compared, uncomplicated by differential prac- 
tice effects from other forms of the test. 


Results 


Table 1 presents the means, standard devi- 
ations and test-retest reliabilities for the vari- 
ous administration procedures. 

It should be noted that these reliability co- 
efficients are to be regarded as conservative 
relative to split-half or immediate retest relia- 
bility estimates often reported. For example, 
Darley (4) has reported a corrected split-half 
reliability of .90 and Blum (1) reported a 
test-retest reliability of .89 for the standard 
test with a half-hour interval between admin- 
istrations. This latter reliability compares 
favorably with our test-retest reliability of .86 
following a longer (one and one-half hours) 
interval. He also reports a reliability of .82 
for the half length test which is identical with 
our results. All these coefficients are higher 
than the original test-retest reliability of .60 
reported by Hines and O'Connor (8) in their 
original standardization of the test. 


⁴ The original method of scoring the test involved 
a small correction in the second half of the test for 
practice on the first half. However, Tiffin and 
Greenley (16) found a correlation of .99 between 
the total time score and scores obtained by the origi- 
nal formula. More recent studies, including those of 
the USES, have used the simpler total time score. 

The correlations obtained between the time 
limit procedure (five minutes), and the full 
board and half board work limit procedures 
were .96 and .89, respectively, after correc- 
tion for attenuation. This gives some indica- 
tion that the abilities measured by the time 
limit and work limit forms of the test are 
the same. 
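
The correction for attenuation referred to above divides an observed correlation by the geometric mean of the two reliabilities. The sketch below is illustrative only: the function name is ours, the reliabilities (.76 for the five-minute test, .86 for the full board) are read from Table 1, and the observed correlation of .78 is a hypothetical value, since the uncorrected coefficients are not reported.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation between true scores from an observed
    correlation r_xy and the reliabilities r_xx and r_yy of the two tests."""
    if not (0.0 < r_xx <= 1.0 and 0.0 < r_yy <= 1.0):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


# Reliabilities as read from Table 1; the observed r is hypothetical.
observed_r = 0.78
corrected = correct_for_attenuation(observed_r, r_xx=0.76, r_yy=0.86)
print(f"corrected r = {corrected:.2f}")
```

A corrected coefficient near 1.00 is what one would expect if the two forms measure the same ability and differ only in error of measurement.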

It can be seen in Table 1 that there is some
loss in reliability when the test is administered
as a time limit test. To achieve reliability
under time limit conditions comparable to that
obtained in the full work limit test (.86), nine
minutes⁵ of testing time would probably be
required. However, the reliability achieved in
the six-minute trial is probably sufficient for
group prediction purposes. If one is using the
test as part of a larger selection battery, one
of the shorter tests probably has sufficient
reliability for inclusion, since a reduction in
reliability of this magnitude would have little
effect on the composite validity of the battery
(see 2, 15). The four-minute test might well be
used where a choice


5 Estimated from the six-minute test by the
Spearman-Brown prophecy formula.
Table 2


Normative Data for Five Administration Conditions of the O’Connor Finger Dexterity Test ¹


Work Limit (Seconds)

Full Board    Half Board

   372           213
   430           217
   442           221
   468           226
   481           233
   507           254
   526           262
   548           271
   572           278
   600           299
   632           317
   647           323
   674           346
   717           379
   746           384


1 Norms for the Work Limit tests are based on a sample of 200 Airmen. Norms for each Time Limit
test are based on separate samples of 100 each.


must be made (as is often necessary) between 
including a longer form of the test or adding 
some additional type of test which broadens 
the scope of abilities sampled by the battery 
in the time allowed. For individual predic- 
tion and guidance purposes, the standard work 
limit procedure is probably desirable. 
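
The nine-minute estimate mentioned above can be retraced with the Spearman-Brown prophecy formula. The following is a minimal sketch under stated assumptions: the function names are ours, and the six-minute reliability of .80 and the work limit reliability of .86 are the values read from Table 1.

```python
def spearman_brown(r: float, k: float) -> float:
    """Predicted reliability when a test of reliability r is lengthened k times."""
    return k * r / (1.0 + (k - 1.0) * r)


def length_factor(r: float, r_target: float) -> float:
    """Factor by which a test of reliability r must be lengthened
    to reach reliability r_target (Spearman-Brown solved for k)."""
    return r_target * (1.0 - r) / (r * (1.0 - r_target))


r_six_minute = 0.80   # six-minute time limit test, as read from Table 1
r_work_limit = 0.86   # full board work limit test, as read from Table 1

k = length_factor(r_six_minute, r_work_limit)
minutes_needed = 6.0 * k
print(f"lengthening factor = {k:.2f}, testing time = {minutes_needed:.1f} minutes")
```

The factor comes to roughly one and one-half, that is, about nine minutes of testing time.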

Table 2 summarizes some preliminary nor- 
mative data for the various administration 
procedures. 

Although these results are based on lim- 
ited samples and are to be regarded as spe- 
cific to this kind of population, they may 
serve as a suggestive guide in future use of 
the test under these conditions. It is also 
to be noted that the work limit scores pre- 
sented are generally higher (poorer perform- 
ance) than those usually reported for other 
populations. 


Table 2 (continued)


Time Limit Conditions (No. of pins placed)

6 Min.    5 Min.         4 Min.

 270       264            180
 245       219            173
 242       214            169
 236       208            165
 228       196            157
 216       185            146
 210       178            139
 204       170            135
 198       166            131
 189       157            127
 182       147            124
 178       142            122
 174       136            116
 168       119            113
 140       [illegible]    101


1 Norms for each Time Limit test are based on separate samples of 100 each.


Summary


The O’Connor Finger Dexterity Test was
administered under various work limit and
short time limit conditions. The results indicate
that although there is some loss in reliability
under the time limit conditions, the
reliabilities are probably adequate for group
prediction, especially if the test is to be
included in a larger battery. Preliminary norms
for the modified administration conditions
were presented.
A Comparison of the Revised Allport-Vernon Scale of Values
(1951) and the Kuder Preference Record (Personal) 


Ira Iscoe and Omer Lucier 


University of Texas 


Both the Allport-Vernon Scale and the
Kuder Preference Record (Personal) yield 
separate “trait” scores which are named and 
defined. The purpose of the research herein 
reported was to examine the communality of 
the various trait scores on the two scales. 
According to the definitions given in the re- 
spective manuals (1, 3), it would be expected 
that a high positive correlation would exist 
between: (1) the Theoretical “trait” scores 
of both instruments; (2) the Economic 
“trait” scores of the Allport and the Practical 
of the Kuder; and (3) the Political of the 
Allport and the Sociable and the Dominant 
of the Kuder. 

It would also be expected that a high negative
correlation would exist between the Aesthetic
of the Allport and the Theoretical of
the Kuder.


Subjects 


Ninety adult males, the majority of them
University of Texas students, acted as subjects.
The mean age of the group was 26
years with an S.D. of 6.5 years. The average
number of years of education was 14.6 with
an S.D. of 2.4. The mean scores and S.D.’s
made by the experimental groups were not
significantly different from the scores made
by comparable groups used by Allport and
Kuder in standardizing their tests.


Procedure 

The tests were administered in accordance 
with the instructions in the respective manu- 
als. The Allport-Vernon was taken first, fol- 
lowed by the Kuder. If the subjects did not 
have time to complete the Kuder during the 
scheduled sessions it was taken home and re- 
turned later. Scattergrams were made for the 
numerous combinations of item scores for one 
inventory with item scores of the other in- 
ventory. Since a rectilinear relationship was 
obtained for all scattergrams the use of the 
Pearson product-moment formula for correla- 
tion was justified. The evaluation of the data 


Table 1


Correlations Between Each Score of the Kuder Preference Record (Personal)
and Each Score of the Allport-Vernon Scale of Values, 1951


                                              Kuder

                         Sociable    Practical    Theoretical    Agreeable    Dominant
Allport          r*      r* = .86    r* = .86     r* = .85       r* = .84     r* = .85

Theoretical     (.87)     -.30         .04          .20           -.08          .01
Economic        (.92)      .00         .09         -.52            .16          .13
Aesthetic       (.90)      .01        -.23          .47            .18         -.08
Social          (.77)      .10         .05       [illegible]       .10         -.08
Political       (.90)      .02         .33         -.36           -.32          .30
Religion        (.90)      .13         .27         -.08            .13         -.16


* The score reliabilities are from the respective manuals, and are placed in parentheses immediately following
the Allport designation and immediately underneath the Kuder designation. The Kuder manual contains
reliability measures computed by the Kuder-Richardson Formula. The population selected for use in Table 1 was
that for “100 men.” The reliabilities from the manual for the Allport scores were obtained from “Test-Retest”
data for 34 cases with one month intervening between the test and the retest. The manual also includes a table
of split-half reliabilities with an N of 100. Statistical material on the revised form of this inventory is as yet
scarce due to the recency of its publication (1951). According to the authors, “the present revision offers certain
improvements without in any way changing the basic purpose of the test (referring to the 1931 version of the
‘Study of Values’ as compared to the 1951 version) or limiting its scope of usefulness” (1, p. 6).
by means of factor analysis was not resorted
to in view of Guilford’s (2) recent article on
“ipsative” factors and his remarks that scales
such as the Kuder were not amenable to factor
analysis. A total of 30 correlations (six
traits of the Allport and five for the Kuder) 
were computed. 
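
The computation described above, one Pearson product-moment coefficient for each pairing of an Allport score with a Kuder score, can be sketched as follows. This is an illustration only: the function name and the five pairs of raw scores are hypothetical, and the scale names are taken from Table 1.

```python
import math
from itertools import product


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


ALLPORT = ["Theoretical", "Economic", "Aesthetic", "Social", "Political", "Religion"]
KUDER = ["Sociable", "Practical", "Theoretical", "Agreeable", "Dominant"]

# Every Allport score is paired with every Kuder score: 6 x 5 = 30 coefficients.
pairs = list(product(ALLPORT, KUDER))
print(len(pairs))  # 30

# Hypothetical raw scores for five subjects on the two "Theoretical" scales.
allport_theoretical = [48, 41, 37, 52, 44]
kuder_theoretical = [25, 30, 28, 33, 27]
r_example = pearson_r(allport_theoretical, kuder_theoretical)
```
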
Results 


It can be seen from Table 1 that none of
the hypotheses put forth at the beginning of
this article was supported. The correlation of
.20 between the two theoretical scales is
surprisingly low. Indeed, the highest positive
correlation obtained (.47) was between the
aesthetic of the Allport and the theoretical of
the Kuder, where the expectancy was for a
high negative correlation. The low positive
correlation between the “Social” of the Allport
and the “Sociable” of the Kuder can be
explained in that, other than having similar
names, they are defined rather differently.


Conclusions 


The results obtained point up once again
the dangers of treating similarly defined traits
measured by different instruments as
interchangeable. As an example, one of our
subjects obtained the following raw scores on
the “Theoretical” of both instruments:


Instrument            Raw Score    Percentile Rank

Allport                  48             73
Kuder (Personal)         25             13


It can be seen that on one instrument he
would be considered of reasonably high
theoretical orientation while on the other he would
be very low. Since both the Allport and the
Kuder are used in educational and vocational
counseling, a totally different picture of this
subject’s interest would have been furnished,
depending on the instrument used. Knowing the
relationships between the various measuring
instruments is perhaps one way of avoiding gross
errors in the counseling situation. One avenue of
further research might be the use of the two
instruments on a population in which certain of
the traits mentioned are believed to be present
to a high degree.


Received July 17, 1952.


3. Kuder, G. F.
Administering Form BB of the Kuder Preference Record, Half
Length


A. A. Canfield¹


Wayne University 


To administer Form BB of the Kuder Preference
Record² to an entire group of people
in industry usually requires more than an 
hour although many finish in less time. Feel- 
ings of fatigue, boredom, and annoyance are 
commonly expressed by the examinees dur- 
ing the course of the test. Sighs of relief are 
regular expressions for employed adults who 
complete the record. The laborious and fre- 
quently painful task of punching the holes 
with the small pin provided, the turning of 
progressively smaller and smaller pages, and 
the apparent duplication of items from page 
to page combine to give an emotional reaction 
that is, in the main, unpleasant. Examinees 
often ask if some scoring technique is used to 
“check-up” on the consistency of their an- 
swers by comparing their answers on what 
they believe to be the same item occurring on 
different pages. 

The objections raised by the examinee are 
not always easy to turn away with good con- 
science, for the specific percentile scores ob- 
tained are normally sorted into quite broad 
and often arbitrary categories for interpretation.
The manual³ accompanying the test in
the section devoted to the interpretation of 
scores recommends that three general score 
categories be used in interpreting the results 
(high for percentiles above 75, low for per- 
centiles below 25, and average for the middle 
50%). Some test users have expanded this 
to include five groupings. It can be very em- 
barrassing to try to explain to an examinee 


1 The author wishes to express his thanks to the 
firm of George Fry and Associates in Chicago, Illi- 
nois for making these data available for the research 
and for providing facilities and supplies for the proc- 
essing of the data, and to Mr. Wesley Potter of 
Northwestern University for his conscientious appli- 
cation to the laborious task of scoring most of the 
papers and preparing frequency distributions. 

2 Kuder, G. F. Kuder Preference Record, Form
BB. Chicago: Science Research Associates, 1942.

3 Kuder, G. F. Revised Manual for the Kuder
Preference Record. Chicago: Science Research
Associates, 1946.

the need for the length of the test, if one admits
to this type of interpretation, an interpretation
which many users of the test make.
In addition to these broad classifications, the 
manual contains sample profiles for 51 occu- 
pations. The manual cites the desirability of 
collecting more data for the purpose of pre- 
paring occupational profiles of greater relia- 
bility, but does not recommend their use in 
counseling or guidance work. 

In many cases a measure of interests, such 
as this test provides, would be a useful ad- 
junct to the information supplied by other 
tests but the testing time required makes it 
impractical. Miles⁴ has recognized the problem
and suggested using the scores obtained
on pages 7, 8, and 9 of the record and then 
multiplying them by constants to predict the 
total score on each of the nine interest areas. 
Using this method on a sample of 205 adult 
males, correlations were obtained between the 
predicted and the actual scores ranging from 
.76 on Part III (Scientific) to .91 on Parts 
II and VI (Computational and Literary). 

The present study was undertaken because 
it was considered desirable to elaborate this 
ratio approach by making a correlational 
analysis and developing regression equations 
for a more accurate prediction of the total 
score. An examination of the answer sheet 
showed that the division of pages in the book- 
let that comes the closest to giving an even 
division of item responses for the nine interest 
areas was that of the odd-numbered pages 
versus the even-numbered pages. Since this 
division also supplied an odd-even reliability 
grouping, it was decided to undertake the 
study using this page division. 


Method 


A total of 301 completed records, representing
a substantial proportion of the persons
employed in supervisory, staff and advisory,
and skilled line positions by a large
midwestern canning company, were pulled
from the files. Each paper was then scored
by interest area, the score on each area being
divided into that achieved on the odd-numbered
pages and that obtained on the even-numbered
pages of the booklet. Inasmuch
as these scores, as well as the totals in some
cases, were noticeably skewed, all of the
distributions were normalized by the percentage
method. Correlations were then computed
between each of these scores and the other two
for each interest area. The computation of
the mean and standard deviation of each
distribution supplied the additional data
necessary to develop the regression equations
desired for predicting the total scores from the
scores on either of these two halves.

To check upon the accuracy of these equations
the completed records of a second sample
of 100 employed males, drawn alphabetically
from the files, were scored in the
same manner. The score in each interest area
was broken down into that obtained on the
odd-numbered pages and that achieved on
the even-numbered pages. None of the papers
used in the first sample were included in this
second group. Correlations were then computed
between the predicted scores and the
obtained scores for each of the nine interest
areas, and the standard errors of estimate
obtained. As a check upon the representativeness
of the two samples used in this research,
the means and standard deviations of
the total scores in each of the interest areas
were also computed.


4 Miles, R. W. A proposed short form of the
Kuder Preference Record. J. appl. Psychol., 1948,
32, 282-285.


Table 1

Correlations and Regression Equations Obtained in the First Sample
N = 301


                       Odd Pages          Odd-Even       Even Pages
Interest Area          Regression Eq.     r              Regression Eq.

1. Mechanical          1.38x + 20.18      .92            1.91x + 9.10
2. Computational       1.85x + 3.30       .77            1.68x + 4.04
3. Scientific          1.82x + 3.95       .78            1.68x + 11.35
4. Persuasive          1.81x + 7.00       .34            1.83x + 7.97
5. Artistic            1.65x + 1.02       .78            1.99x + 5.93
6. Literary            2.01x + 8.28       .81            1.60x + 3.31
7. Musical             1.83x + 2.20       .78            1.74x + 1.63
8. Social Service      1.80x + 5.36       .76            1.67x + 13.97
9. Clerical            2.10x + 6.29       [illegible]    1.38x + 9.04


Results 


The original correlations between the scores
on the odd-numbered pages and the total
scores, and the resulting regression equations 
are presented in the first two columns of 
Table 1. The correlations between the scores 
on the odd-numbered and even-numbered 
pages for each of the nine interest areas are 
shown in the third column. Inasmuch as they 
represent odd-even reliability figures, the 
corrected values, using the Spearman-Brown 
prophecy formula, are shown in the adjacent 
column. These values are almost identical 
with those given in the manual for reliabili- 
ties computed using the Kuder-Richardson 
method. The two right hand columns of Table 
1 show the correlations between the scores on 
the even pages and the total scores, and the 
resulting regression equations. These regres- 
sion equations were then used for predicting 
the total scores as previously described. 
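
The two computations described above, the Spearman-Brown correction of an odd-even correlation and the prediction of a total score from a half-length score, can be sketched as follows. The function names are ours. The odd-even correlation of .92 and the equation 1.38x + 20.18 are the Mechanical entries as printed in Table 1; the odd-page score of 40 is hypothetical. The helper `regression_line` shows the usual least-squares route from a correlation, two means and two standard deviations, which we assume, without confirmation from the article, to be the one followed.

```python
def spearman_brown_full_length(r_half: float) -> float:
    """Step an odd-even (half-test) correlation up to full-length reliability."""
    return 2.0 * r_half / (1.0 + r_half)


def regression_line(r, mean_x, sd_x, mean_y, sd_y):
    """Least-squares slope and intercept for predicting y (total score)
    from x (half score), given their correlation, means and S.D.'s."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_total(half_score, slope, intercept):
    """Apply a regression equation of the form total = slope * x + intercept."""
    return slope * half_score + intercept


# Mechanical area, odd pages, as printed in Table 1: total = 1.38x + 20.18
MECHANICAL_ODD = (1.38, 20.18)

corrected_r = spearman_brown_full_length(0.92)
predicted_total = predict_total(40, *MECHANICAL_ODD)  # 40 is a hypothetical odd-page score
print(f"corrected reliability = {corrected_r:.2f}")
print(f"predicted total score = {predicted_total:.2f}")
```
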

The correlations obtained between the pre- 
dicted total scores, based on the odd-page 
scores, and the obtained total scores for the 
second sample, along with the mean errors of 
prediction and the standard errors of estimate 
are shown in the first three columns of Table 
2. The same information for the predictions 
Table 2


Correlations Between Predicted and Obtained Scores, Mean Errors, and Standard Errors
of Estimate Obtained in the Second Sample


Odd Pages


Interest Area

1. Mechanical
2. Computational
3. Scientific
4. Persuasive
5. Artistic
6. Literary
7. Musical
8. Social Service
9. Clerical


based on even-page scores are given in the last 
three columns of Table 2. 
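
The three quantities reported in Table 2 can be obtained from paired lists of predicted and obtained total scores, as in the sketch below. It is illustrative only: the function names and the four score pairs are hypothetical, the mean error is taken as predicted minus obtained (a sign convention we assume), and the standard error of estimate uses the familiar form S.D. × √(1 − r²).

```python
import math
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def prediction_summary(predicted, obtained):
    """Return (r, mean error, standard error of estimate) for one interest area."""
    r = pearson_r(predicted, obtained)
    # Assumed sign convention: a positive mean error means over-prediction.
    mean_error = mean(p - o for p, o in zip(predicted, obtained))
    # max() guards against a tiny negative value from floating-point rounding.
    se_estimate = pstdev(obtained) * math.sqrt(max(0.0, 1.0 - r * r))
    return r, mean_error, se_estimate


# Hypothetical predicted and obtained total scores for four examinees.
predicted = [62.0, 75.4, 81.1, 90.3]
obtained = [60.0, 78.0, 80.0, 93.0]
r, mean_error, se_estimate = prediction_summary(predicted, obtained)
```
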

It will be noted that the correlations be- 
tween the predicted total scores and the ob- 
tained total scores range from .90 to .97. 
The standard errors of estimate are small,
considering the percentile equivalents, and the
mean errors are similarly small.⁵

Table 3 shows the means and standard
deviations of the two groups used in this 
study and the means and standard deviations 

5 Conversion tables have been prepared which 
make it possible to translate the part-score directly 
into the percentile score in each of the nine interest 
areas. A copy of these conversion tables can be se- 
cured from the Department of Personnel Methods, 
School of Business, Wayne University, Detroit 1, 
Michigan, at no cost.


Table 2 (continued)


N = 100


Even Pages

S.E. est.    M error

4.94           .21
4.06           .10
5.51          -.48
5.39           .40 [sign illegible]
4.27           .44
4.96           .65
3.16          -.22
6.18           .32
5.39          [illegible]


of the norm group reported in the manual. 
The means and standard deviations of the 
three groups are generally similar, with the 
exception of the generally higher interests of 
the experimental groups in the persuasive 
area. 


Summary 


This study was designed to determine the 
plausibility of administering Form BB of the 
Kuder Preference Record half length. An 
analysis of the test answer sheet indicated 
that the odd pages and the even pages of the 
test contained a fairly even distribution of 
items in each of the nine interest areas meas- 
ured by the test. 


Table 3


Means and Standard Deviations of the Two Sample Groups and the Test Norm Group


                       Prelim. Study    Verification Study    Norm Group
                       (N = 301)        (N = 100)             (N = 2667)

Interest Area          S.D.             M        S.D.         S.D.

1. Mechanical          18.8             74.3     20.3         22.8
2. Computational       11.4                      11.7         10.6
3. Scientific          14.8                      14.5         15.5
4. Persuasive          20.9                      21.0         20.6
5. Artistic            12.9                      14.0         13.6
6. Literary            14.3                      12.7         15.1
7. Musical              7.8                       7.2          9.6
8. Social Service      16.8                      17.5
9. Clerical            13.3                      13.5
The completed answer sheets of 301 employed
males, representing a variety of jobs,
were analyzed and mathematical equations 
for predicting total scores from the scores on 
the two different sets of pages were developed. 
A second sample of 100 employed males was 
used to test the accuracy of predictions using 
these equations. 

The results indicate that the test could be
administered half length with little loss of
accuracy under normal conditions of test
interpretation. This reduction in administration
time means an appreciable reduction in
testing costs, greatly lessened feelings of
fatigue, boredom, and irritation for the examinee,
greater possibilities for using the test in
industrial situations where its administration
time has previously been considered prohibitive,
and an opportunity to use the time
saved for the administration of other tests
that might contribute to the prediction of
job success.


Received June 30, 1952. 
Attitudes Toward Public Low-Rent Housing, Before and After
Construction * 


Kenneth E. Clark 


University of Minnesota 


and 


Charles E. Swanson 


Institute of Communications Research, University of Illinois 


Two years prior to the collection of the 
data reported herein, there was announced 
in a midwestern city a plan for the erection 
of a public housing project, using federal 
funds, to provide living facilities for persons 
of low income. Since this project was to be 
built in an area fairly well surrounded by ex- 
isting housing, it was considered that the sur- 
vey of attitudes of persons in the neighbor- 
hood both before and after construction of 
the development would provide significant 
information on the dynamics of attitude 
changes. The results of the original survey 
before construction have already been reported
in this journal.¹


The two surveys were made under fairly
comparable conditions. The first was made
in June 1950, shortly after announcement of 
approval of the project. The second survey 
was made just two years later, approximately 
one year after construction had started, and 
about two weeks before the first families be- 
gan to move into the project. The first 
sample was drawn as a fixed-address City 
Directory sample; of 196 units listed, inter- 
views with a responsible adult were obtained 
in 188, or 96 per cent. In the second sample 
this same list of addresses was used, supple- 
mented by an additional sample of 192 resi- 
dences. A total list of 388 addresses was ob- 
tained, of which 366 were usable (17 had 
been torn down, 4 addresses were erroneous, 
1 house was vacant). Of these 366, 351 

* The writers are indebted to Mr. Norris Ellertson 
for his work in the supervision of the interviewing 
staff used in this study, and for his work in the 
analysis of results. This study was made possible 
by the support of the Office of Naval Research, 
Project N6onr—246, T.O. IV, NR 173-348. 

1 Clark, K. E., and Swanson, C. E. Neighborhood
reaction to public low-rent housing. J. appl.
Psychol., 1951, 35, 342-347.

households, or 96 per cent, were contacted
and interviewed. Six householders refused 
to be interviewed (less than 2 per cent); 2 
would not answer the door; 2 were not at 
home after 2 call-backs; 2 were out of town; 
2 returns were not usable; 1 was unable to 
help because of death in the family. 
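
The accounting of the 1952 sample given above can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the variable names are ours.

```python
# Addresses on the 1952 list: the original 196 plus 192 supplementary residences.
listed = 196 + 192                      # 388

unusable = {"torn down": 17, "erroneous address": 4, "vacant": 1}
usable = listed - sum(unusable.values())  # 366

not_interviewed = {
    "refused": 6,
    "would not answer the door": 2,
    "not at home after 2 call-backs": 2,
    "out of town": 2,
    "return not usable": 2,
    "death in the family": 1,
}
interviewed = usable - sum(not_interviewed.values())  # 351

rate_1952 = interviewed / usable   # proportion of usable addresses interviewed in 1952
rate_1950 = 188 / 196              # 1950 survey: 188 interviews from 196 listed units

print(f"1952: {interviewed} of {usable} = {rate_1952:.1%}")
print(f"1950: 188 of 196 = {rate_1950:.1%}")
```

Both surveys come out at about 96 per cent, and the six refusals amount to less than 2 per cent of the usable addresses, as stated.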

The same questionnaire used prior to con- 
struction was used after construction, with 
only minor changes (“how many stories will
. . .” was changed to “how many stories does
. . .”). A new question was added to permit
sorting the householders into those present 
in the community at the time of the first sur- 
vey and those not present. 


Results 

Opposition to the housing project decreased 
somewhat over the two year period. Re- 
sponses to the questions “Do you favor or 
oppose the construction of this new development?”
and “How strongly do you feel about
this?” for the two years, 1950 and 1952, are 
presented in Table 1. This increase in favor 
occurs without a reduction in the number of 
no opinion responses. This is a rather sur- 
prising result, since one might expect that, 
with the project in actual physical existence, 
more persons would have formulated an opin- 
ion of some sort. The physical appearance 
of these new units in general attracted favor- 
able comment, which may account in part for 
the shift in attitude, but certainly not for 
the continued high percentage of persons re- 
fusing to state a position. 

Table 1


Opinions Toward Low Rent Housing Project in 1950 and in 1952


                                                              No Opinion
                             Favor           Oppose           or Qualified
                             1950    1952    1950    1952     1950           1952

Total Group: Number           73     159      71     108       44             84
             Per Cent         39      45      38      31       23             24
By Intensity of Feeling:
  Very strongly               42%     46%     49%     53%     [illegible]    [illegible]
  Rather strongly             33      35      39      33      [illegible]    [illegible]
  Not strongly at all         24      17      12      13      [illegible]    [illegible]
  No answer                    1       2       0       1      [illegible]    [illegible]

  Total                      100%    100%    100%    100%     100%           100%


The same question asked about income in
1950 was asked again in 1952, except that an
additional $500 was added to each response
category as a rough estimate of the average
increment in income which might have been
expected during this period. These two
distributions are presented in Table 2.

That our estimate of the average increment
was not too far off is indicated by the
similarity of the two distributions.

Table 3 shows that in 1950 a slightly larger
proportion of persons with high incomes


tended to favor the project than did those 
with lower incomes. This was also true in 
1952. The 1952 opinions are generally more 
favorable for all three groups; the lowest in- 
come group, however, has a much larger per- 
centage of undecided respondents in 1952 
than it had in 1950. Why this should be is 


Table 2


Reported Incomes of Respondents in 1950 and in 1952


Income Category                   N               Per Cent
1950            1952              1950    1952    1950    1952

$5,000 up       $5,500 up           62     108      33      31
4,000-4,999     4,500-5,499         38      54      20      16
3,000-3,999     3,500-4,499         36      71
2,000-2,999     2,500-3,499         30      53
1,000-1,999     1,500-2,499         13      21
0-999           0-1,499              2      15
No answer                            7      29

Total                              188     351


Table 3


Opinion on Housing According to Income Level in 1950 and in 1952


                                                                                       Qualified or
Income Level                            N             Favor          Oppose           No Opinion
1950                1952                1950   1952   1950   1952    1950   1952      1950   1952

$5,000 and up       $5,500 and up         62    108    44%    47%     35%    34%       21%    19%
$3,000 to 4,999     $3,500 to 5,499       74    125    39     52      35     28        26     20
Less than $3,000    Less than $3,500      45     89    35             43     27        22     33



not clear. It would seem that these persons
became more uncertain about the project as
its reality increased.

Tables 4, 5, and 6 present responses to three
questions requiring prediction about the effects
of the project, divided according to responses
favoring or opposing the project. These results
are particularly interesting since they
indicate a lessening of predictions of an
unpleasant sort by those who oppose the project.
Thus, those persons who still say they oppose
the project do so with fewer associated feelings
of unpleasant consequences of the project.


Table 4


Do You Think Property Values Will Go Up, Down, or Stay the Same?


                             N             Go Up          Go Down        Stay Same      Other
                             1950   1952   1950   1952    1950   1952    1950   1952    1950   1952

Total Group                   188    351     5%     4%     35%    28%     50%    60%     10%     8%
Favor Project                  73    159     8      5       7     11      77     77       8      7
Oppose Project                 71    108     3      2                     16     30       7      8
No Opinion or Qualified
  Opinion on Project           44     84     2      2                     61     68      19     11


A series of information questions was included
in the original survey to determine the
degree to which persons had become acquainted
with specific portions of the plan
for the development. Responses in 1950 and
1952 are compared in Table 7. Item numbers
refer to the following questions:


1. “About how many families do you understand
will be housed in this development?”
Correct answer in 1950 was,
“120”; in 1952, “184.”

2. “How much a month will be charged for


Table 5


Do You Think This Unit Will Bring Undesirable People into Neighborhood?


                             N             Yes            No             Other
                             1950   1952   1950   1952    1950   1952    1950   1952

Total Group                   188    351    41%    28%     38%    45%     21%    27%
Favor Project                  73    159    15     11      73     67      12     22
Oppose Project                 71    108    76     58      10     18      14     24
No Opinion or Qualified
  Opinion on Project           44     84    30     23      25     39      45     38


Table 6


Will Construction of Development Have Effect on Your Long Term Plans to Stay
or Move Out of Neighborhood?


                             N             Yes            Other
                             1950   1952   1950   1952    1950           1952

Total Group                   188    351    28%    12%      8%            11%
Favor Project                  73    159    11      1       3              3
Oppose Project                 71    108    55     33      17             20
No Opinion or Qualified
  Opinion on Project           44     84    11      6      [illegible]    15
Table 7


Correct Responses to Information Questions in 1950 and 1952


                             N             Question 1
                             1950   1952   1950   1952

Total Group                   188    351    30%    23%
Favor Project                  73    159    32
Oppose Project                 71    108    32     29
No Opinion or Qualified
  Opinion on Project           44     84    23     19


rent, heat and utilities for one of these
units?” Correct answer in both years
was “about $36 per month.”

3. “How many stories do the units have?”
Correct answer in both years was, “two.”

4. “What is the most money a family can
make a year and still rent a place in the
development?” Correct answer in both
years was, “not to exceed $2400 plus
$100 per dependent.”

5. “If an undesirable family gets into this
development will the housing authority
be able to get them out?” Correct
answer in both years was, “yes.”


The percentages reported in Table 7 are 
the percentages of correct answers. Only 
question 1 shows a decrease in correct in- 
formation, perhaps due to the change in cor- 
rect answer from 120 families in 1950 to 184 
families in 1952. The only item on which 
opponents of the project show less informa- 
tion than proponents is question 5. Question 
3 shows a remarkable improvement in in- 
formation, although it is rather surprising 
that even though this large and prominent 
project exists in their immediate neighbor- 
hood, 21% of the residents do not know that 
the buildings are two stories in height! 


Results from Matched Respondents 


The preceding analysis is of interest in de- 
scribing the total change in sentiment in 
the community, but yields little information 
about what happens to the individual re- 
spondent as a result of watching this project 
develop from the planning to the construc- 
tion stage. Accordingly, from the sample of 
addresses used in both the 1950 and 1952 


Table 7 (continued)


Correct Responses to Information Questions in 1950 and 1952


                             Question 2            Question 3     Question 4     Question 5
                             1950   1952           1950   1952    1950   1952    1950   1952

Total Group                   27%    34%            15%    79%     12%    23%     24%    39%
Favor Project                 30     32             25     84      12     25      47     52
Oppose Project                31     [illegible]    13     84      13     26      24     26
No Opinion or Qualified
  Opinion on Project          16     29              2     64       9     14      27     32


surveys, respondents were matched, as nearly 
as possible, using information on apparent 
age, sex, and reported education. A total of 
171 of the original 188 households were in- 
cluded in the 1952 survey. Of these, 14 were 
newcomers; i.e., they reported that they were 
not living in their present house in June of 
1950. Of the remaining 157, 66 persons 
were found with matching characteristics, 
and are assumed to have been interviewed in 
1950 and 1952. Their responses for these 
two surveys are shown in Table 8. 

These results are not in accord with those 
for Table 1, where the percentage of quali- 
fied and no opinion responses remained as 
high in 1952 as in 1950, and where, appar- 
ently, gains in favor were made at the ex- 
pense of the “oppose” group. These results, 
for matched respondents, indicate that gains 
in favor are made at the expense of the quali- 
fied or no opinion groups. 

Further information on this point is ob- 
tained by comparing the responses of matched 


Table 8 


Comparison of Responses of Same Persons 
in 1950 and 1952 


Qualified 
or No 
Opinion 


1950 Total 


23 


Oppose 
1 


Favor 





Favor 

Qualified or 
No Opinion 

Oppose 


5 
15 


Total 21 
households, rather than matched individuals. 
These results are shown in Table 9. (Note 
that the following values include the persons 
already reported in Table 8.) 

The results in Table 9 have somewhat more 
meaning than the preceding ones, since they 
are based on larger N’s, but have lost some- 
what in their significance since they represent 
matched households rather than matched in- 
dividuals. If we may overlook the latter fac- 
tor, it seems clear that the most significant 
changes in response have occurred in the 
original opposition group. This group shows 
almost as much shift in opinion as the origi- 
nal no-opinion group, and shifts almost as 
much to favor as it does to no-opinion. 


Table 9 


Comparison of Responses of Same Households 
in 1950 and 1952 


1952 


Qualified 
or No 
Oppose Opinion — Favor 


1950 Total 


Favor 9 
Qualified or 

No Opinion 
Oppose 


Total 


The total number of households shifting 
from one position to another, shown in Table 
9, is larger than one might have predicted, 
especially when the direction of shift varies 
as much as it apparently does. Is it possible 
that some of the original responses were held 
with little intensity, and so were almost the 
same as no-opinion responses? Some evidence 
on this point is available in the matched- 
person sample in Table 8 from the response 
to the question on intensity with which the 
opinion was held. One might expect the dis- 
tribution of responses in 1950 of “changers” 
to be somewhat different from that of the 
“non-changers.” But such is not the case. 
Although in this group the N’s are very 
small (only 8 persons with an original re- 
sponse of favor or oppose in the matched 
sample changed their responses in 1952), the 
distributions on intensity are still so nearly 
identical as to suggest that this is not a likely 
explanation. What seems more likely is that 
the large number of “changers” occurs only 
when we interview different persons in the 
same household. This suggests “division of 
opinion” within households as well as between 
households. 

The findings of a study of this sort have 
considerable significance for those persons 
associated with planning civic programs, for, 
in working on plans, one must not only con- 
sider the opinions of one’s public at the time 
plans for change are announced, but also the 
eventual degree of acceptance or non-accept- 
ance of the completed project. In this par- 
ticular instance the effect of the actual con- 
struction of a low-rent housing project was 
to reduce, but only to a slight degree, the 
opposition of the neighboring residents to the 
project. 


Summary 


Shortly after the approval in 1950 of plans 
for a low-rent housing project in a metro- 
politan area, interviews with neighboring 
householders were conducted to determine 
their opinions about this project. Again in 
1952, shortly after completion of the con- 
struction of the project, but before the new 
occupants began to move in, interviews were 
conducted with the original sample of house- 
holds, and with an additional sample of about 
the same size. It was found that: 

1. About equal numbers favored and op- 
posed the project in 1950, with about one in 
four undecided. In 1952, about the same 
proportion continued to be undecided, but the 
proportion favoring the project had increased 
slightly (from 39 per cent to 45 per cent). 
The large number of respondents who con- 
tinue to be “undecided” in 1952 occurs in 
spite of the prominence of the project, the 
considerable publicity given to it, and the 
controversy about it. 

2. Comparison of the responses of persons 
and households who appeared in both the 
1950 and 1952 samples does not indicate very 
clearly the source of the increased response 
in favor of the project, but does suggest that 
when changes occurred, they were more likely 
to be from oppose to undecided, or undecided 
to favor, rather than from oppose to favor. 

3. What changes did occur must be considered 
to be the result of the appearance of 
the project rather than the characteristics of 
the new residents, since interviewing was completed 
before the units were occupied. 


Kenneth E. Clark and Charles E. Swanson 


4. Of incidental interest is that, by means 
of several call-backs plus the use of well 
trained interviewers, the number of refusals 
to be interviewed was kept below two per 
cent in both 1950 and 1952, and the number 
of interviews completed in the two fixed- 
address samples was maintained at about 96 
per cent. 


Received August 11, 1952. 
The research to be reported in this article 
was designed to provide some information on 
the following questions: (1) how well might 
the performance of a pair of individuals on 
a manual dexterity task be predicted from a 
knowledge of the individual manual dexterity 
scores of the two persons making up the 
group; and (2) does the relative level of 
group performance seem to be more closely 
associated with the lower or the higher of the 
individual performers. 


Experimental Procedure 


Each “group” studied in this research con- 
sisted of two men. One hundred and thirty 
volunteers were recruited about equally from 
undergraduate and graduate students to make 
up 65 groups. No attempt was made to con- 
trol the placement of individuals into groups. 
Some pairs were composed of friends, but in 
the majority of cases the individuals were 
either unknown to each other or were only 
casually acquainted. 


The subjects in each pair were brought into a 
well lighted and ventilated experimental room and 
seated across from each other at a table of ap- 
proximately office-desk proportions. The experi- 
menter was seated at the far end of this table. 
In front of each subject was a Purdue Pegboard 
Test, the two boards touching each other at the 
ends containing the peg cups. The standard in- 
structions for the Purdue Pegboard, Assembly 
Task were read to the subjects. Following this, 
six standard trials were taken, the “Assembly” 
scores being recorded after each trial for each 
subject individually. Each subject was able to 
see how well the other person was doing in com- 
parison with his own performance. 

Following the completion of six trials of indi- 
vidual performance, one of the pegboards was re- 
moved and the other was placed between the two 
subjects, the long direction of the board perpen- 
dicular to the axis through the subjects, and the 
cups away from the experimenter. The follow- 
ing instructions were read by the experimenter: 


“In the second part of the experiment, you 
will work on the same type of task except that 
you will work together rather than individu- 
ally. First (subject A—on E’s left) will pick 


up a peg and place it in the first hole of the 
row nearest you. Then (subject B—on E’s 
right) will pick up a washer and place it over 
the pin. Then (subject A) will pick up a 
collar and place it over the washer. Then 
(subject B) will pick up a washer and place it 
over the collar, completing the first assembly. 
At the same time, (subject B) will pick up a 
peg with the other hand and place it in the 
second hole of the row nearest him. Then 
(subject A) will place on a washer, (subject 
B) will put on the collar, and finally (subject 
A) will place on the final washer, picking up a 
peg at the same time with the other hand and 
placing it in the hole diagonally across from 
the assembly being completed. Thus, the as- 
semblies zigzag down the board, each person’s 
assignment alternating on each successive as- 
sembly. Now do a few assemblies for prac- 
tice.” 


When it was clear that the subjects understood 
the nature of the group task, six trials of one 
minute each were taken, scores being recorded 
for each trial. Scoring was the same as for the 
individual assembly task. This completed the ex- 
perimental session, usually taking 25 to 30 min- 
utes. In reading the instructions for the group 
task, the subject’s name was inserted in the ap- 
propriate space, italicized in the text given above. 


Treatment of the Data 


The statistical analysis of the data proceeded 
in the following steps: 

(1) For each person, individual assembly scores 
on trials three and five were added together and 
scores on trials four and six were added together. 
These “split-half” scores were used to obtain reliability 
estimates and also were added to give a 
total individual performance score. The same 
procedure was followed for the “group” scores. 

(2) The members of each pair were designated 
as “high” or “low,” respectively, on the basis of 
their total individual performance scores com- 
puted in (1) above. The person of the pair 
with the higher total was automatically classified 
as “high” and his partner was classified as “low.” 
Many individuals in the “low” classification had 
higher scores than some persons in the “high” 
classification. This apparently arbitrary method 
of dividing subjects was adopted because one 
objective of the experiment was to determine 
whether the lower or the higher of the two indi- 
vidual performers of a pair would have a greater 
influence on their group effort. Common sense 
Table 1 


Summary of Results 


Score ] a 
High 


Low 


16.5 

16.8 

19.2 
R = .66 


Group 


might suggest that the pair could do no better 
than the poorer man, in a normative sense. 

(3) Pearson product-moment correlations were 
computed between the “split-half” scores in the 
“high” and “low” categories and also for the 
“group” performances. Thus, the 65 persons in 
the “low” category each had two half scores. 
These scores were correlated and the correlation 
corrected for doubled length by the Spearman- 
Brown prophecy formula to obtain a reliability 
estimate for total individual performance scores 
in the “low” category. The same procedure was 
followed for those persons in the “high” classifi- 
cation and for the pairs of individuals involved 
in the “group” performance. 

(4) Pearson correlations were computed be- 
tween “high” and “low” individual performances, 
between “high” individual and “group” perform- 
ances, and between “low” individual and “group” 
performances. These correlation coefficients were 
corrected for attenuation in both variables in- 
volved, using the estimates of reliability obtained 
in (3) above. The correlations so obtained were 
treated as estimates of the values which might be 
expected between the given variables had the 
measures involved been entirely free of errors of 
measurement. 

(5) A coefficient of multiple correlation be- 
tween “group” performance and the respective 
“high” and “low” individual performances was 
computed, using the coefficients of correlation 
corrected for attenuation, as obtained in (4) 
above. The multiple correlation was computed 
using correlations corrected for attenuation be- 
cause it was desired to have an estimate of the 
maximum amount of variance which could be 
predicted under ideal conditions, i.e., with errors 
of measurement absent. The idea was to gain 
some indication of how much variance might be 
attributable to certain additional unknown vari- 
ables. 

(6) Beta weights for the “high” and “low” 
scores, respectively, were computed for the re- 
gression equation to predict “group” performance 
scores. 
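
Steps (3) and (4) rest on two standard formulas: the Spearman-Brown correction for doubled length and the correction for attenuation. The Python sketch below illustrates the arithmetic only. The function names and the numerical values are assumptions introduced for illustration and are not taken from the data of this study.

```python
import math

def spearman_brown(r_half: float) -> float:
    """Step a split-half correlation up to the reliability of the full-length score."""
    return 2.0 * r_half / (1.0 + r_half)

def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation between x and y with errors of measurement removed."""
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical values, chosen only to show the arithmetic.
half_r = 0.75                                   # correlation between the two half scores
reliability = spearman_brown(half_r)            # reliability of the total score
corrected = correct_for_attenuation(0.45, 0.90, 0.80)
print(round(reliability, 3), round(corrected, 3))
```

Because both reliabilities are below 1.00, the corrected coefficient is always at least as large in absolute value as the observed one, which is why the corrected values are read as upper estimates.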

Results 


The results of the statistical analysis have 
been summarized in Table 1. Inspection of 


                 Corrected r with 
Score       High    Low    Group    Beta Weight 
High        1.00    .52    .56      .35 
Low          .52   1.00    .59      .41 
Group        .56    .59   1.00 
R² = .44 

the scatter plots revealed no indication of 
curvilinear regressions, although the plot be- 
tween “high” and “low” scores had a restric- 
tion due to the fact that the “low” score of a 
pair could not be greater than the “high” 
score.¹ In the first column of Table 1 are 
listed the total score categories, “high,” “low,” 
and “group,” standing, respectively, for those 
total performances as described in the previ- 
ous section. The means and standard devia- 
tions of the three sets of scores are given in 
the second and third columns, respectively. 
These are based on the totals of the last four 
of six trials. This procedure was decided 
upon in advance to obtain more stabilized re- 
sults. In the fourth column are given the 
reliability estimates as obtained by the pro- 
cedure described in (3) of the last section. 
The next three columns of Table 1 give the 
intercorrelations of the total score variables, 
corrected for attenuation. The steps were 
described in (4) of the section on treatment 
of the data. The last column contains the 
beta weights for predicting “group” perform- 
ance from “high” and “low” individual per- 
formances. The multiple correlation, R, and 
R², as described in (5) of the last section, are 
given in the bottom row of the table. 
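
The artificial restriction on the “high”-“low” plot, which the footnote checks with 65 pairs from a table of random numbers, can also be examined by simulation. The sketch below is an illustration under stated assumptions, not the author's procedure: it draws pairs of two-digit numbers from Python's pseudo-random generator, and the function names and seed are hypothetical.

```python
import random

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def high_low_correlation(n_pairs, seed=0):
    """Correlate the higher with the lower member of random two-digit pairs."""
    rng = random.Random(seed)
    highs, lows = [], []
    for _ in range(n_pairs):
        a, b = rng.randint(0, 99), rng.randint(0, 99)
        highs.append(max(a, b))
        lows.append(min(a, b))
    return pearson(highs, lows)

small = high_low_correlation(65)        # same number of pairs as in the footnote
large = high_low_correlation(100_000)   # large sample, near the long-run value
print(round(small, 2), round(large, 2))
```

For independent, uniformly distributed numbers the long-run correlation between the larger and the smaller member of a pair is about .50, so a value in the neighborhood of .5 from 65 pairs is what sorting alone would be expected to produce.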


Discussion 


Information pertaining to the first question 
in this research is given by the multiple cor- 
relation coefficient between “high” and “low” 
scores and the “group” score. The square of 
that coefficient indicates that 44 per cent of 


1 To determine the effect of this artificial restriction, 
65 pairs of two-digit numbers were taken from 
a table of random numbers, placing arbitrarily the 
higher of the two numbers in the first group and the 
lower of the two in the second group. The result- 
ing Pearson correlation was .56. 
the variance in an errorless measure of group 
performance could be predicted from a linear 
combination of perfectly reliable “high” and 
“low” individual scores. The percentage 
which can be predicted by fallible measures 
would be less. 
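
The beta weights and the multiple correlation follow directly from the three corrected correlations by the usual two-predictor formulas. The sketch below is an illustration with a hypothetical function name; it uses the corrected coefficients as read from Table 1 (high-low .52, high-group .56, low-group .59).

```python
def two_predictor_regression(r_hg, r_lg, r_hl):
    """Standardized weights and R squared for predicting group from high and low scores.

    r_hg: correlation of "high" with "group"
    r_lg: correlation of "low" with "group"
    r_hl: correlation of "high" with "low"
    """
    denom = 1.0 - r_hl ** 2
    beta_high = (r_hg - r_lg * r_hl) / denom
    beta_low = (r_lg - r_hg * r_hl) / denom
    r_squared = beta_high * r_hg + beta_low * r_lg
    return beta_high, beta_low, r_squared

# Corrected correlations as read from Table 1.
beta_high, beta_low, r_squared = two_predictor_regression(0.56, 0.59, 0.52)
print(round(beta_high, 2), round(beta_low, 2),
      round(r_squared, 2), round(r_squared ** 0.5, 2))
```

With these inputs the weights round to .35 and .41, and R² rounds to .44 (R to .66), in agreement with the tabled values.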

Three possible explanations of this result 
will be given. First, the task itself may ac- 
tually be significantly different in its nature 
from the individual assembly task. It was 
necessary for the subjects to alternate opera- 
tions on succeeding assemblies when working 
together which was not the case in individual 
operations. In future work, a redesigned in- 
dividual task will be used which requires the 
subject to interchange the sequence of hand 
movements on alternate assemblies. This 
should make the operations in the individual 
task more like those in the group task. 

Even though the previously mentioned dif- 
ference in the individual and “group” tasks 
were eliminated, this would by no means in- 
dicate that the “group” task would then be 
the same to the participating persons as their 
individual tasks. The two tasks are probably 
different to each individual not only because 
of an intrinsic difference in the sequence or 
character of the operations but also because 
the group situation brings in new elements 
requiring the utilization of different abilities. 
The person must anticipate the moves of his 
partner to achieve a smooth performance. In 
short, it is suggested that there may be a 
group of abilities possessed to different de- 
grees by different individuals which determine 
in part how well they will perform in certain 
group situations. These new abilities may be 
independent of those which determine the 
performance of the same operations in the 
same sequence by the individual as a single 
performer. 

A third possible explanation of these re- 
sults lies in the hypothesis of interactions 
among individuals. It may be that some or 
even all subjects will work more effectively 
with some individuals than with others. 
Under this hypothesis, variations in group 
performance may be substantially influenced 
by the extent to which persons are paired 
who will work best together. 
The second objective of the experiment was 
to determine if the lower individual per- 
former of a pair influenced the group per- 
formance more than the higher individual 
performer. This question may be answered 
by an examination of the correlations of the 
“low” and “high” scores with the “group” 
scores. The correlations, corrected for at- 
tenuation, were .53 and .50, respectively. 
The difference in effect on group performance 
by the “high” and “low” pair members is so 
slight as to be of no practical consequence. 
Thus, group performance here seems to be a 
function of the average of the two individual 
scores. Under these conditions, for a given 
group of workers, there would seem to be 
little to gain by trying to pair off the high 
ones and the low ones, expecting thereby to 
get more over-all production from the group 
as a whole. This conclusion naturally pre- 
sumes a similar type of prediction situation 
and the lack of further information beyond 
that which was available here. 

The same statistical treatment was also 
given to the data from the first two trials. 
Reliabilities were somewhat lower, .76, .82, 
and .74 for the “high,” “low,” and “group” 
scores, respectively. Intercorrelations among 
these scores, corrected for attenuation as be- 
fore, were “low-high,” .53, “low-group,” .67, 
and “high-group,” .58. R² was .51. Thus, 
during the practice trials, the “low” men in- 
fluenced the group performance slightly more 
than they did in later trials. Also, during 
these trials, the group scores were more highly 
related to the individual performance scores 
than during the test trials, as shown by the 
higher R². The stabilized performance results 
probably have the greater practical 
value, however. 


Summary 


Sixty-five pairs of volunteer male university 
students were given six trials on the Purdue 
Pegboard, Assembly Task, and six trials on 
the Assembly Task with the two members of 
each pair working together on the same as- 
semblies rather than individually on separate 


boards. The members of each pair were di- 
vided on the basis of the total of the last four 
individual trials, Assembly Task, into “high” 
and “low” categories. Reliabilities were determined 
for “high,” “low,” and “group” performances, 
using alternate trials and correcting 
for doubled length. Correlations of the 
“high” and “low” performances with the 
“group” performance and with each other 
were computed and corrected for attenuation. 
The multiple correlation and regression 
weights were obtained for predicting “group” 
performance from “high” and “low” individ- 
ual performances. 

The results showed that less than half the 
group performance variance could be pre- 
dicted from a knowledge of the individual 
performances, even with the effect of er- 
rors removed. It is suggested that manifest 
differences between the “individual” and 


“group” tasks, interactions among individuals, 
and a constellation of abilities in the 
general area of cooperation may account for 
the variance not predicted by perfectly re- 
liable individual performance scores. 


Andrew L. Comrey 


The level of group performance was only 
slightly more dependent on the “low” individ- 
ual performances. For all practical purposes 
equal weights could be used for “high” and 
“low” scores in predicting “group” perform- 
ance. 

Two practical implications of the results 
of this experiment are as follows. First, in 
industrial situations where two or more in- 
dividuals must cooperate on a given task, it 
must not be assumed that individual perform- 
ances on a similar type of task will account 
for most of the variation in group perform- 
ance. Secondly, for a given group of persons 
there seems to be little point in taking the 
trouble to pair them on individual ability in 
a related type of individual task since group 
performance seems to be dependent on the 
approximate average of their individual scores 
rather than the high or low individual per- 
formance. 


Received August 4, 1952 
Response Time as an Indicator of Color Deficiency * 


In 1907 Froeberg (2) reported that reaction 
time varied inversely with the intensity 
of the stimulus. Since that time reaction 
time measures have been put to a variety of 
uses. Steinman (6) reported that simple re- 
action time to stimulus change was an ade- 
quate method for studying sensitivity to 
stimuli. She also found that reaction time de- 
creased as the magnitude of the change in- 
creased. In a study whose purpose was to 
determine the speed and accuracy of dis- 
criminations of hue, brilliance, area, and 
shape of visual stimuli, J. B. Reed (4) found 
that as the difference between two areas or 
hues is increased, discrimination time is de- 
creased. 

When pseudo-isochromatic plates are used, 
the subject reacts to a stimulus complex. 
This suggests that if the discrimination re- 
quired is difficult, the response time would be 
longer than if the discrimination were easy. 
Further, if color contrast is absent on the test 
plate the subject would hesitate and seek 
other cues, such as differential brightness, as 
a basis for responding. 

Reports from several investigators lend sup- 
port to this notion. J. D. Reed (5) studied 
reactions to a complex submarine signal panel 
board, and reported that use of reaction time 
measures revealed the increased difficulty of 
discriminations for color defectives. Also 
studies by Pickford (3) and Sultzman (7) 
refer to hesitancy on the part of the color de- 
fective individuals. 

On the basis of these reported observations 
of hesitancy on the part of color defective in- 
dividuals, the following hypothesis was tested: 
color “defective” individuals will have longer 

* The writers would like to express their sincere 
thanks to Lt. Comdr. Dean Farnsworth, M.S.C., 
USNR, Head, Visual Engineering Section, U. S. 
Naval Medical Research Laboratory, New London, 
Conn., for his review of the manuscript. The opin- 
ions or assertions expressed in this paper are those 


of the writers and are not necessarily those of the 
military departments. 


mean plate response times than individuals 
classified as “normal.” 


Method and Procedure 


Test. The color test used in this experi- 
ment consisted of a set of 15 pseudo-isochro- 
matic plates (14 diagnostic and 1 demonstra- 
tion) selected from the American Optical 
Company test¹ by Farnsworth and called the 
“Proposed Armed Forces Color Vision Test 
for Screening” (1). 

Subjects. A total of 136 students (108 
male, 28 female) from the University of 
Maryland were used in the experiment. 

Method. Test plates were presented singly 
to the subjects in the order prescribed by 
Farnsworth (1). The subjects were tested 
twice in succession. Half of the subjects were 
given a criterion trial first, while the remain- 
ing half were given a test trial first. The 
criterion trial was conducted exactly as recom- 
mended by Farnsworth (1) except that the 
subjects were instructed to respond to the 
plate as soon as possible. If no response was 
made in 3 sec., the plate was removed. The 
test trial differed from the criterion trial only 
in one respect, i.e., response times were taken 
for each plate in the test trial. 

Apparatus. Illumination was provided by 
a Macbeth Daylight lamp (No. ADE 10) as 
suggested by Farnsworth (1). The test plates 
appeared against a flat black background and 
were placed on a bracket which slid rapidly 
into a viewing aperture. Presentation of the 
plate started an electric chronoscope cali- 
brated in .01 sec. The subject’s verbal re- 
sponse activated a voice key and stopped the 
timer. Verbal reports and response times as 
described above were recorded. 


Results and Discussion 


The error scores made by the subjects on 
the criterion trial were used to classify the 
1 Pseudo-Isochromatic Plates for Testing Color 
Perception (Revised Selection, American Optical 
Company). 
Fig. 1. Mean plate response time frequency distribution 
for normal and color defective subjects 
(N = 136). 


subjects as normal or defective. The norms 
recommended by Farnsworth (1) were used 
in the classification; i.e., four errors or less 
for normals, five errors or more for defectives. 

Using this criterion, 110 subjects (84 male 
and 26 female) were classified as normal. 
The mean plate response time for the normal 
group was .66 sec., ranging from .31 sec. to 
1.42 sec. The defective group had 26 mem- 
bers (24 male and 2 female). Mean plate 
response time for the defective group was 1.93 
sec., ranging from 1.24 to 2.40 sec. Fig. 1 
shows the mean plate response time distribu- 
tion for all subjects, normal and defective. It 
should be noted that subjects classified as 
normal by the criterion test are also cate- 
gorized as normal by mean plate response 
times, and that the deficient subjects, with the 
exception of two cases, are classified as de- 
fective by both error score and response time 
measures. 

The t-test was used to test the significance 
of the difference between the means of the 
distributions of mean plate response times for 
the normal and defective subjects. The dif- 
ference was found to be significant at less 
than the .001 level of confidence. 
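
The article does not state which form of the t-test was applied. As one possibility, the sketch below computes the pooled-variance t for two independent groups. The function name and the response times are hypothetical and are not the experimental data.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    na, nb = len(sample_a), len(sample_b)
    va = statistics.variance(sample_a)   # unbiased sample variance
    vb = statistics.variance(sample_b)
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / se
    return t, na + nb - 2

# Hypothetical mean plate response times in seconds.
defective_times = [1.7, 1.9, 2.1]
normal_times = [0.5, 0.6, 0.7, 0.8]

t, df = pooled_t(defective_times, normal_times)
print(round(t, 2), df)
```

The resulting t would then be referred to a t table with the stated degrees of freedom. With groups as unequal in size as 110 and 26, a separate-variance form might be preferred if the group variances differ markedly.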

Table 1 presents a plate-by-plate analysis 
of the errors made by the normal and defec- 
tive subjects. The first analysis of the error 
scores consisted of a t-test to determine 
whether or not a significant practice effect 
existed. We were unable to reject the null 


hypothesis for the defective group, but were 
able to reject the null hypothesis at less than 
the .01 level of confidence for the normal 
group. This could mean that there is no ap- 
preciable change in error scores for defective 
subjects. On the other hand, normal subjects 
made significantly fewer errors on their second 
test trial. 

It may be noted from Table 1 that normal 
subjects in varying numbers missed plates 2, 
6, 10, 11, 14, and 15. Specifically, 51 per 
cent of the normal subjects missed plate 6, 
and 39 per cent missed plate 15 on the first 
trial. Errors were made on every plate ex- 
cept the demonstration plate by the defective 


Table 1 
Number of Errors per Plate by Normal and Defective 
Subjects for the First and Second Trials 


Normal (N = 110)        Defective (N = 26) 
Trial 1   Trial 2       Trial 1   Trial 2 
20 22 
25 25 
16 17 
20 23 
25 24 
20 17 


18 
24 


13 14 
18 
23 25 25 


individuals. Plates 2, 3, 5, 6, 7, 11, 14, and 
15, however, were missed by a larger number 
of these subjects than the other plates. 

It was stated earlier that response time 
measures might prove useful as an indicator 
of color deficiency. An examination of Fig. 
1 reveals that the mean plate response time 
frequency distribution of the normal and de- 
fective subjects overlaps in only one interval, 
1.20-1.35 sec. With the exception of the two 
cases represented, all subjects with a mean 
plate response time of 1.50 sec. or over (or 
an over-all response time of 21.0 sec.) could 
be classified as deficient, and those with response 
times less than 1.50 sec. as normal. 
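
The two classification rules described in the text, the error-score norm and the 1.50-sec. mean response time cut-off, can be written out as a short Python sketch. The constant and function names are hypothetical, and the sample response times are illustrative values set at the two reported group means.

```python
N_DIAGNOSTIC_PLATES = 14     # 15 plates, one of which is a demonstration plate
MEAN_RT_CUTOFF_SEC = 1.50    # cut-off read from Fig. 1
MAX_ERRORS_NORMAL = 4        # norm used in the text: four errors or less is normal

def classify_by_errors(errors: int) -> str:
    """Classify a subject from the criterion-trial error score."""
    return "normal" if errors <= MAX_ERRORS_NORMAL else "defective"

def classify_by_response_time(plate_times_sec) -> str:
    """Classify a subject from the mean response time over the timed plates."""
    mean_rt = sum(plate_times_sec) / len(plate_times_sec)
    return "defective" if mean_rt >= MEAN_RT_CUTOFF_SEC else "normal"

# The over-all cut-off quoted in the text is the per-plate cut-off times 14 plates.
print(N_DIAGNOSTIC_PLATES * MEAN_RT_CUTOFF_SEC)

# Illustrative subjects whose every plate time equals a reported group mean.
print(classify_by_response_time([0.66] * N_DIAGNOSTIC_PLATES))
print(classify_by_response_time([1.93] * N_DIAGNOSTIC_PLATES))
```

Comparing the two functions subject by subject is the agreement check reported above, where all but two cases received the same classification from both measures.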

It may well be possible that the limit of 3 
sec. employed put an artificial limit on the 
response times. A color-blind individual 
would not necessarily, at the end of this 
period make the correct response or any re- 
sponse. On the other hand, a normal or near- 
normal subject, who usually eventually makes 
correct responses, might well be categorized 
by a response time method. For further in- 
vestigation of the practical usefulness of this 
method, the 3-sec. ceiling should be extended. 

The problem in classification is that of the 
near-normal subjects. The number of border- 
line subjects in the present experiment was 
limited, and these borderline subjects con- 
stitute the difficult classification problem. 
Most tests will roughly separate the extremes 
of the population. The problem is to devise 
tests to select the Class II group (mildly de- 
fective color vision) as classified by Farns- 
worth (1). 

Further research is indicated along this 
general line since it is entirely possible that 
response time measures could serve in classi- 
fication. It is possible that memorization of 
the plates could be detected by this method. 
Response time measures could readily be used 
in military and industrial situations. 


Summary 


Response time measures to 15 selected 
plates of the AO pseudo-isochromatic test for 
color perception were secured for 136 college 
students (28 females, 108 males). Of this 
group, 110 were classed as normal, 26 as defective, 
on the basis of their error scores. 
Each subject was given two successive tests: 
criterion and test trials. Mean plate response 
times between the normal (0.66 sec.) and de- 
fective (1.93 sec.) groups were found to differ 
significantly. Practice effects were noted 
within the normal group, but were not found 
in the defective group. It was concluded that 
response time measures could be used in the 
separation of color normal from color defec- 
tive individuals. 


Received August 14, 1952. 
With equipment-systems periodically gaining 
in complexity, both industry and the 
Armed Forces are faced with the task of 
training personnel in the maintenance of these 
equipment-systems. However, the problem 
of how to systematically train maintenance 
men has thus far received meager experi- 
mental treatment. 

It has frequently been postulated that the 
trouble shooting process of locating defects in 
equipment-systems is similar to that encoun- 
tered in problem solving situations. When a 
mechanic is confronted with an equipment- 
system that is “malfunctioning” and no ap- 
propriate response is available, the process of 
locating the defective part clearly possesses 
the properties of a problem situation. Unfor- 
tunately, the gap between the problem solving 
literature and understanding complex trouble 
shooting processes is not easily bridged. Although 
several promising concepts are embodied 
in the problem solving literature, apparently 
there is no simple transplanting of 
these notions to problems encountered in the 
trouble shooting of complex equipment-systems. 

The purpose of the present experiment was 
to test a concept found important in previous 
problem solving studies by applying it to a 
more realistic problem situation. 

The studies of Maier (2, 3) suggest that a 
knowledge of the required parts of a solution 
does not necessarily mean the occurrence of 
a solution. Evidence is presented that indi- 
cates additional information in the form of a 
set is necessary; necessary in the sense that 
the additional set clears the way and increases 
the probability of a correct solution. 

The hypothesis tested in this experiment 
was that the ability to “trouble shoot” or lo- 
cate defects in an equipment-system entails 


1 The authors are indebted to Mr. Walter Ciszczon 
for material contributions to apparatus, and to Mr. 
Jasper Smaliks for aiding in the derivation of the 
symptom analysis “trouble shooting” method. 


more than being trained in the basic com- 
ponents, or essential parts of the equipment. 

It was postulated that in addition to teach- 
ing basic components something more was re- 
quired in the form of a set of principles deal- 
ing with systematic location of a “malfunc- 
tioning” component. 


Method 


Apparatus. The apparatus in Figure 1 is called 
a gear-train consisting simply of a set of gears 
and shafts mounted on a piece of aluminum 1% 
inch thick, 29 inches in length, and 20 inches in 
width. The gear-trains were arranged to form 
two series and four parallel channels that pro- 
vided for crossed information chains. 

Two operating controls, A and B, provided the 
input necessary to obtain the desired motion. 
The motion was transferred through the gear- 
trains and as an end result closed a switch that 
caused a series of red lights to illuminate the 
control panel. 

The red lights would illuminate only if the 
equipment was functioning properly and control 
A was turned 13 times and control B, 12 times. 
When the appropriate number of turns was made 
and the expected end result (control panel light-
ing up) did not occur, this indicated to S that
there was a “malfunction” in the gear-train. 

Malfunctions. For the purposes of the experi- 
ment only one class of “malfunction” was uti- 
lized. It was a defect of the “slipping gear type” 
produced by loosening a set screw and found by 
the authors in a previous study (1) to be fairly 
difficult. Each S received six malfunctions of 
this type on the pre-test and six malfunctions of 
the same type on the post-test, making 12 mal- 
functions that each S was required to locate. 

Ss were presented the malfunctions in a ran- 
dom order, and, in addition, each of the malfunc- 
tions was inserted at a location determined from 
a table of random numbers. 

Procedure. Ss were run individually, and im- 
mediately upon entering the laboratory for the 
first time E gave S the Standard Operating Pro- 
cedure (hereafter referred to as S. O. P.) for the 
apparatus. The S. O. P. consisted of turning 
control handle A, 13 turns and control handle B, 
12 turns, and if the red lights on the control panel 
did not light up, it indicated to S that something 
was wrong with the gear-train and the task was 
to “trouble shoot” the equipment. After the S. 
O. P. orientation, S was given six malfunctions 
or problems to locate. Each problem was given 
singly, S being placed in an adjoining room while 
the malfunction was being inserted by E. A 
time limit of 15 minutes was allowed for each 
malfunction. 

When each S had completed working on the 
initial six “malfunctions,” the following procedure 
was followed. Ss in Group 1 were taken to an 
adjoining room for a 20-minute period during 
which they were allowed to read a current Life 
magazine. Ss in Group 2 received a tape-re- 
corded “basic knowledge” lecture. Integrated 
with the lecture were slides projected on a screen 
with a 35 mm. camera. The rationale of the 
lecture was to convey the basic nomenclature 
and function of the gear-train apparatus. In- 
cluded were such concepts as transfer of motion, 
and the function of gears, bearings, and shafts. 
Group 3 received the basic knowledge information 
given to Group 2, and, in addition, was given a 
tape-recorded lecture on how to “trouble shoot” 
the gear-train. This “trouble shooting” lecture 
was based on a simplified version of a previously 
developed symptom analysis guide to “trouble 
shooting.” The lecture stressed starting with the 
greatest magnitude of error, locating the first cor- 
rectly operating component nearest the greatest 
error, and once these two points were bracketed,
locating the defect somewhere between.²
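
The bracketing rule stressed in the lecture can be illustrated with a short Python sketch. It is not the authors' symptom analysis guide: it assumes a single chain of components ordered from input to output, a single defect, and a hypothetical list `ok` of observations (True where a component is seen to operate correctly).

```python
def bracket_defect(ok):
    """Return (last_good, first_bad) indices bracketing a single defect.

    `ok` holds one boolean per component in a chain ordered from input to
    output (an assumption of this sketch). Motion lost at the defect is
    also lost downstream, so readings look like [True, ..., False, ...].
    """
    # Start at the point of greatest error: the output end of the chain.
    worst = len(ok) - 1
    if ok[worst]:
        return None  # the end result is correct, nothing to locate
    # Work back toward the input to the nearest correctly operating component.
    good = worst
    while good >= 0 and not ok[good]:
        good -= 1
    # The defect lies between the last good and the first bad component.
    return good, good + 1


readings = [True, True, True, False, False, False]
print(bracket_defect(readings))  # prints (2, 3)
```

A result of (-1, 0) would mean that no component was observed to operate correctly, so the defect would be sought at the first component.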


2 The symptom analysis “trouble shooting” lecture 
has been filed with the American Documentation In- 
stitute. Order Document 3968 from ADI Auxiliary 
Publications Project, Photoduplication Service, Li- 


Fig. 1. 
Subjects. The Ss were 54 college students en- 
rolled in the School of Education at Indiana 
University. Three groups of 18 Ss each were 
used in the experiment. Assignment of the 54 
Ss to each of the three groups was done from a 
table of random numbers. Participation in the 
experiment was required in order to eliminate the 
bias often found by asking for volunteers. 


Results 


The raw data for this experiment are the 
post-test gains or the number of malfunctions 
correctly located minus the number located 
on the pre-test, and the total time required 
to reach a decision as to the location of the 
malfunctions. Arriving at a decision, how- 
ever, does not necessarily mean that it was a 
correct one. 

Figure 2 illustrates graphically the post- 
test gains made by the three groups of sub- 
jects in “trouble shooting” the gear-train ap- 
paratus. Statistical significance of post-test 
gains was tested by an analysis of variance 
of the thirteen possible scores ranging from 
+6 to −6. 


brary of Congress, Washington, D. C., remitting 
$1.25 for microfilm (images 1 inch high on standard 
35 mm. motion picture film) or $1.25 for photoprints 


The gear-train apparatus. 
Fig. 2. Post-test gains in malfunctions correctly 
located over the initial measures. (Bar chart; vertical 
axis: gain in “malfunctions” located; one bar each for 
Groups 1, 2, and 3.) 


Prior to computing the analysis of variance, 
Bartlett’s χ² test was utilized to test the 
homogeneity of variance of defects located. 
Since the χ² value of 2.26 did not reach the 
5% level (df = 2), the hypothesis of no dif- 
ferences among group variances could not be 
rejected. 

The analysis of variance performed on post- 
test gains is summarized in Table 1. The ob- 
tained F value of 9.36 for 2 and 51 degrees 
of freedom was significant at the 1% level of 
confidence. 
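
As a check on the arithmetic, the F ratio can be recomputed from the sums of squares and degrees of freedom given in Table 1. The short Python sketch below does only that; the function name `f_ratio` is introduced here for illustration.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) from sums of squares and df."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Values as reported in Table 1.
ms_b, ms_w, f = f_ratio(27.88, 2, 76.01, 51)
print(round(ms_b, 2), round(ms_w, 2), round(f, 2))
```

The mean squares come out at 13.94 and 1.49, and the F at about 9.35, which agrees with the reported 9.36 up to rounding of the mean squares.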

It appears, then, that the significant gains 
demonstrated by the performance of Group 3 
subjects can defensibly be attributed to the 
effects of the “trouble shooting” lecture that 
the two remaining groups did not receive. 

Table 1 

Analysis of Variance of Differences Between Initial 
and Test Location of “Malfunctions” 

Source of Variance    df    Sum of Squares    Mean Squares    F 

Between Groups         2         27.88            13.94       9.36* 
Within Groups         51         76.01             1.49 

Total                 53        103.89 

* Significant beyond the 1% level of confidence. 


Table 2 

Covariance Analysis for Time to Decide the Location 
of Pre- and Post-Test Malfunctions 

Source of Variance    Sum of Squares 

Total                     618.37 
Within Groups             531.79 

Adjusted Means             86.58 

* Significant at the 5% level of confidence. 


However, a glance at Figure 3, showing the 
time in minutes for the groups to reach a de- 
cision with respect to where various malfunc- 
tions were located, points out an interesting 
reversal. Although Group 3 was superior in 
locating defects under test conditions, it is 
clear they failed to decrease subsequent “trou- 
ble shooting” time, while the time required by 
the remaining two groups was reduced. In 
order to test for differences with regard to 
time taken to reach a decision as to the loca- 
tion of defects, an analysis of covariance, 
shown in Table 2, was carried out between the 
pre-test and post-test time measures. 

Fig. 3. Comparison between the pre- and post- 
test of total time required to decide the location of 
malfunctions. (Bar chart; vertical axis: location time 
in minutes; pre-test and post-test bars for Groups 1, 
2, and 3.) 

The F of 4.07 was significant at the 5% 
level indicating that the means of the groups 
on the post-test time measures cannot be ac- 
counted for by differences in mean level of 
initial ability as measured in the pre-test 
trials. 
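
The reported F can be reproduced from the adjusted sums of squares in Table 2 under one assumption that the surviving table does not state: that the degrees of freedom are those of a three-group analysis with 54 Ss and a single covariate, namely 2 for adjusted means and 54 − 3 − 1 = 50 within groups. A minimal Python sketch on that assumption:

```python
# Adjusted sums of squares as reported in Table 2.
ss_adjusted_means = 86.58
ss_within = 531.79

# Degrees of freedom are an assumption of this sketch (3 groups, 54 Ss,
# one covariate); they are not given in the table as it survives.
df_adjusted_means = 3 - 1
df_within = 54 - 3 - 1

ms_adjusted_means = ss_adjusted_means / df_adjusted_means
ms_within = ss_within / df_within
f_adjusted = ms_adjusted_means / ms_within
print(round(f_adjusted, 2))  # prints 4.07
```

Under these assumed degrees of freedom the ratio agrees with the F of 4.07 given in the text.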

Accordingly, the results of this study indicate 
that the additional “trouble shooting” lec- 
ture acted to produce a differential effect on 
subsequent performance in locating gear-train 
defects. The group which received both the 
basic knowledge and “trouble shooting” sets 
did not appreciably reduce the time scores in 
comparison with the remaining groups. 

Time is a rather dubious criterion of per- 
formance in the trouble shooting situation. 
A comparison of Figures 2 and 3 (gains and 
time) suggests that the longer time required 
by Group 3 may be attributed to deliberation 
required for an accurate judgment, while in 
Group 1 the small time required might be at- 
tributed to snap judgment. 

In the final analysis, the findings suggest 
that besides learning about various com- 
ponents of an equipment-system, systematic 
training in “trouble shooting” methodology is 
required in order to obtain efficient results. 

Intensive research with more complex equip- 
ment is needed to determine the additional 
skills and knowledges required to “trouble 
shoot” successfully. Such a problem as de- 
termining which trouble shooting procedure 
is more generalizable than another, and test- 
ing its transfer power on successively more 
complex equipment is, indeed, a fascinating 
laboratory challenge. By using the laboratory 
method on these and related problems, much 
useful information about problem solving in 
general, and the significant variables of “trou- 
ble shooting” in particular could be systema- 
tically obtained. 


Summary 


The experiment reported tested the hypoth- 
esis that ability to “trouble shoot” or locate 
defects in a specified equipment-system re- 
quires something more than being trained in 
the parts or components of that equipment- 
system. Fifty-four undergraduate students 
enrolled in the School of Education at Indiana 
University participated in the experiment. 
Certain training factors were common to the 
three groups. All Ss received identical in- 
doctrination in the Standard Operating Pro- 
cedure for a gear-train apparatus, after which 
each S was given six problems or malfunctions 
to locate in the equipment. This procedure 
was used to obtain a pre-test measure of 
“trouble shooting” ability on the gear-train. 
After the initial measure, Group 1 received 
no further information, Group 2 received a 
tape-recorded basic knowledge lecture that 
explained the nomenclature and functioning 
of the gear-train, while Group 3 received the 
basic knowledge lecture plus a symptom analy- 
sis lecture designed to aid in “trouble shoot- 
ing” the gear-train apparatus. 

The post-test gains indicate that the addi- 
tional “trouble shooting” lecture acted to 
produce a significant gain in malfunctions cor- 
rectly located. Time required, however, de- 
creased for Groups 1 and 2, but remained con- 
stant for Group 3. It is suggested that time 
is a dubious criterion of “trouble shooting” 
performance. 


Received March 6, 1953. 
Early publication. 
Aids to Jet Pilots * 


John E. Murray 


Dunlap and Associates, Inc., Stamford, Connecticut 


A modern trend in the progress of aviation 
is toward the provision of improved facilities 
for the pilots of high-speed, high-altitude air- 
craft. One of the basic tools required by such 
pilots is the aeronautical chart. A review of 
the literature and an examination of current 
charts show that existing charts fail, in some 
respects, to provide the pilots of high-speed, 
high-altitude aircraft with a highly effective 
navigational tool. Much of the material pre- 
sented on the charts is superfluous: some of 
the natural features cannot be seen from high 
altitudes, and much of the chart content can- 
not be absorbed in the time available for 
navigation at high speeds. 

An increase in the number of charts was 
not accompanied by a judicious selection of 
the chart content. As more aeronautical in- 
formation became available, it was added to 
the basic chart without consideration of the 
flight and navigational requirements of mod- 
ern planes and air operations. 

This procedure resulted in the production 
of all-purpose charts: charts for use in any 
aircraft on any type of flight. These charts 
are cluttered with information, difficult to 
read and inconvenient to use because of their 
size. To overcome some of these defects, 
special purpose charts were designed but with- 
out the use of adequate criteria in the selec- 
tion and presentation of information. More- 
over, there seem at present to be no well 
established methods by which aeronautical 
charts can be evaluated. Even more striking 
is the fact that, in the history of chart pro- 
duction, very little systematic study has been 
made of the pilot’s task of interpreting the 
information presented on the charts. Con- 
sequently, the major objective of this study 
was to devise experimental techniques ap- 
plicable to the evaluation of principles of 
chart construction. 


* This research was supported under the terms of 
the contract between the Office of Naval Research, 
and Dunlap and Associates, Inc., Contract Number 
N8onr 641-05. This paper is a summary of Report 
No. 641-05-6 under that contract. 

From the point of view of the psychologist, 
this study is valuable in that it demonstrates 
the application of psychological methodology 
to the problem of chart evaluation. From the 
cartographer’s viewpoint, the experimental 
evaluation of charts yields two types of in- 
formation which can serve in the future course 
of chart development: (1) a test of the ap- 
plicability of general principles and tech- 
niques of chart construction; and (2) the rela- 
tive value of different methods of presenting 
information. 

In order for information to be used most 
efficiently by the pilot, it should obviously be 
displayed so as to provide maximum legibility. 
This involves a determination of the best 
method of presenting chart information and 
requires a study of the contributions of color, 
type of symbol, size and style of printed type, 
and other related items to legibility. 

There are two basic ways in which a chart 
can be evaluated. One is subjective and de- 
pends upon pilots’ opinions which can be 
gathered from interviews and systematic ques- 
tionnaires; the other is objective and requires 
the collection of performance test data of 
various sorts. Both methods have been em- 
ployed in the present study. The design of 
this study involves the following steps: 

1. The selection of specific features to be 
included and evaluated on experimental 
charts. 

2. The preparation of tests to measure the 
readability of charts. 

3. The preparation of a test to measure the 
effectiveness of charts in representing what 
the pilot sees from the air. 

4. The construction of a questionnaire to 
determine the pilot’s attitude toward experi- 
mental charts in terms of their content and 
practicability for actual flight conditions. 

The results of the first step are embodied 
in the structure of two experimental charts. 





Evaluation of Two Experimental Charts 


These charts differ in the amount, kind and 
method of presenting information to the pilot. 
The significant differences between the two 
experimental charts are displayed in Table 
1. The precise objective of this study was to 
determine the relative effectiveness of the 
present World Aeronautical Chart (WAC) 
and the two experimental charts, the XJN 
Chart produced by the Aeronautical Chart 
and Information Service and the XDA Chart 
designed for the Office of Naval Research. 
Representative samples of the three charts 
are presented in Figure 1. 


Evaluation Procedures 


Readability Tests. Tests were designed to 
determine the speed and accuracy with which 
pilots can find and use information contained 
in the charts. Given the task of reporting 
certain specified information, the speed and 
accuracy of performing this task with each 
chart can be taken as an index of the effec- 
tiveness of their presentation. In effect, these 
are tests of legibility or ease of reading. This 
legibility is a function of type size, symbols 
used, color and the density of the information 
shown on the chart. The relative superiority 
of the charts can be determined for those fea- 
tures in which they differ. The following 
tests of readability were constructed: 

Part 1. Airport Information. Flight lines 
were drawn connecting fourteen airports. 
The subject was required to give the airport 
type, elevation, runway length and available 
electronic facilities for each of the airports 
specified. The maximum possible score on 
this test was 50. 

Part II. Radio Information. Similar flight 
lines were drawn connecting various radio aids 
on each chart and the subject was required to 
give the type, frequency and call letters for 
each radio aid specified. The maximum possi- 
ble score was 42. 


Table 1 

Differences in the Presentation of Specific Features on the XJN and XDA Charts 

Feature                            XJN                                 XDA 

Front of Chart¹ 
 1. Color of land and water areas  Yellow and blue                     Green and blue 
 2. Terrain features 
      Contour lines                Hypsometric tints                   Shadient tints to approximate 
                                                                       three-dimensional view 
                                   Spot elevations                     Highest peak only 
 3. Cities                         Predominant in yellow               Subdued in gray 
 4. Transportation lines           Roads and railroads differentiated  Roads and railroads indicated by 
                                                                       same symbol 
 5. Airports                       Jet and military airports shown     All airports shown have adequate 
                                   by runway pattern; civil shown      lighting and hard surface runway 
                                   by circle; lighting and surface     and are represented by runway 
                                   facilities indicated                pattern; type of airport shown in 
                                                                       data note; GCA and DF facilities 
                                                                       indicated 
 6. State names and boundaries     Shown by dotted lines               Not shown on this chart 
 7. Colors of symbols              Both airports and radio             Airport information in blue; radio 
                                   information in magenta              information in an improved magenta 
 8. Navigation light lines         Shown by solid lines                Not presented 
 9. Distance scale                 Along edge from 0 to 1,000 miles    Starts from 0 at either end toward 
                                                                       500 in center; in bold type on 
                                                                       both sides of chart 

Back of Chart² 
10. Radio beacon                   Symbol prominent                    Symbol subdued 
11. Broadcasting station           Symbol subdued                      Symbol prominent 
12. Radio range                    Shows N quadrant                    Differentiates terminal and 
                                                                       non-terminal ranges; shows inbound 
                                                                       magnetic headings 

¹ On the XJN chart, radio and airport information are presented in the same color and the symbols for each 
differ on the front and back of the chart; on the XDA, airport and radio information are differentiated in color 
but the symbols for each type of data are consistent on both the front and back of the chart. 

² On the back of the chart, XJN shows Morse code, reporting points, fan markers, dumb-bell markers, airways; 
XDA does not present these features but includes the Atlantic coast line and a list of YG stations. 
Part III. Natural and Cultural Features. 
The subject was required to read and inter- 
pret various features pertaining to terrain, 
roads and railroads, cities, rivers, etc., used 
for navigation in cross-country flight. The 
maximum possible score was 15. 

Part IV. Aerial Photographs. This test 
was designed to measure the individual’s 
ability to read an aerial photograph and to 
determine its geographic location on the chart. 
Seven photographs were selected from a series 
taken at an altitude of approximately 40,000 
feet on a flight from Dayton, Ohio to Wash- 
ington, D. C. The subject was required to 
locate the area pictured in each photograph 
on the test chart provided. 

Fig. 1. Sample sections of the WAC, XJN, and XDA charts. 

Each experimental session was prefaced by 
an introductory statement covering the pur- 
pose of the study and the experimental pro- 
cedure to be followed. Time limits of five 
minutes each were imposed on Parts I and II; 
seven minutes each were allowed on Parts 
III and IV. Each session required approxi- 
mately 45 minutes of which 24 minutes were 
for working time and the remainder for pre- 
liminary instructions. 

The tests were administered to groups of 
20 to 25 pilots at each session. A total of 
72 Navy pilots were tested on the XJN Chart, 
66 on the XDA Chart and 60 on the WAC 
chart for a total of 198 pilots. 

Item Analysis. As a more refined measure 
of the effectiveness of the charts, each of the 
four tests was subjected to an item analysis. 
The number of individuals who marked each 
item correctly was determined and a com- 
parison among the three charts was made on 
each item. In those instances where items 
were incorrectly marked, the frequency with 
which other alternatives were chosen was re- 
corded. 

Questionnaire. To elicit pilot preferences 
for specific features on the experimental 
charts, a questionnaire was distributed to 
another group of Navy pilots. The question- 
naire consisted of a series of questions con- 
cerning the features which were differently 
presented on the two charts. In each ques- 
tion, the pilot was asked to state his prefer- 
ence for one of the charts in regard to some 
specific feature. Where applicable, reasons 
for the choice or preference were also re- 
quested. Free comments, whether favorable 
or unfavorable, were encouraged as much as 
possible. In all, 43 pilots were interviewed 
with the questionnaire either on an individual 
basis or in small groups of two to four men 
each. 

Results 


Readability Tests. The mean score ob- 
tained on each test for each chart is presented 
in Table 2. To determine the effectiveness of 
each of the charts, the test scores were com- 
pared by the standard t-test techniques. The 
test results indicate that airport, radio and 
cultural information can be read more quickly 
and accurately on the experimental charts 
than on the traditional WAC chart. The 
XDA is significantly superior to the XJN 
Chart in presenting airport information. This 
superiority is probably due to the prominence 
of the airport symbol and the simplicity of 
the corresponding data note. 

Table 2 

Mean Scores Obtained on Readability Tests 
for the Charts Specified 

                          No. in               Standard 
Chart                     Group      Mean      Deviation 

Test I 
Airport Information 
  WAC                       60       37.1 
  XDA                       66       46.3 
  XJN                       72       43.0 

Test II 
Radio Information 
  WAC                       60       34.5 
  XDA                       65       39.9 
  XJN                       71       39.0 

Test III 
Cultural Features 
  WAC                       60 
  XDA                       66 
  XJN                       72 

Test IV 
Aerial Photographs 
  WAC                       60       3.05        1.22 
  XDA                       66       3.23        0.97 
  XJN                       72       3.47        0.76 


Only minor differences exist among the 
charts when used to identify locations of 
aerial photographs. This finding would im- 
ply that the reduction in the amount of detail 
on the experimental charts does not hinder 
the pilot’s identification of reference points. 

Item Analysis. The data from the item 
analysis clearly show the relative value of 
each of the charts as a means of presenting 
information to the pilot. Economy of space 
does not permit the inclusion of the data ob- 
tained for each item. The important differ- 
ences among the charts can be summarized as 
follows: 

1. In the time limit allowed, fewer items 
were completed on the WAC Chart than on 
either the XDA or XJN Charts. This differ- 
ence seems to be due to the mass of informa- 
tion shown as well as to the unsystematized 
placement of the data notes on the WAC 
Chart. Furthermore, the size and scale of 
the WAC Chart make it awkward to manipu- 
late and difficult to locate the information 
required. 

2. The runway patterns on the XDA and 
XJN Charts were more effective than the 
traditional circular symbols on the WAC 
Chart. 
3. Data notes are more readily identified 
when placed close to their related objects. 
Misidentification of certain airports on the 
XDA Chart, for example, resulted from im- 
proper placement of the data notes pertaining 
to these airports. 

4. Security areas are best shown on the 
XJN Chart. This seems to be due to the 
type size and face used in presenting these 
areas. 

5. In presenting terrain features, the XDA 
Chart is superior to both the XJN and WAC 
Charts. 

Questionnaire. The preferences of pilots 
for the specific features on the two experi- 
mental charts are as follows: 

1. Printed material on the XDA Chart is 
more easily read although the chart has a 
more cluttered appearance. 

2. The mileage scale on the XDA Chart is 
preferred. The scale should range from 0 to 
500 miles from either end of the chart and it 
should be presented in the same manner on 
both sides of the chart. 

3. Runway patterns on the XDA Chart are 
preferred by 93 per cent of the pilots inter- 
viewed. It is considered desirable to present 
only those airports with adequate landing 
facilities for jet aircraft. 

4. The bold type for airport information 
on the XDA Chart is preferred and GCA and 
DF facilities are highly desirable. 

5. The radio broadcast symbol on the XJN 
Chart is preferred. It can be distinguished 
easily from the other radio symbols. 

6. On the back of the chart, the radio 
beacon symbol appearing on the XJN Chart is 
preferred; the radio broadcast symbol ap- 
pearing on the XDA Chart is preferred. 

7. Range stations are considered the most 
important radio aids to navigation. 

8. In presenting terrain features, pilots 
prefer the shadient tints of the XDA Chart 
but with the spot elevations of the XJN 
Chart. 

9. Pilots prefer the presentation of large 
cities in yellow as shown on the XJN Chart. 

10. The names of cities are more easily 
read on the XJN Chart. This seems to be 
due to the contrast between the black print 
and the yellow background of the land area. 
11. Pilots prefer the differentiation of roads 
and railroads as shown on the XJN Chart. 

12. Radio information is preferred on both 
sides of the chart but the symbols should be 
consistent on both sides. 

13. The inbound magnetic headings on the 
range legs of the XDA Chart and the indica- 
tion of the “N” quadrants on the XJN Chart 
were both highly favored. 

14. Coastal outlines, cities, roads and rail- 
roads, and terminal ranges are desirable on 
the chart; non-terminal ranges are preferred 
in a less prominent form. 

15. Airways, fan markers, state names and 
boundaries are of minor importance to the 
pilot and need not be shown on the chart. 

16. The size and scale of the two experi- 
mental charts are satisfactory but a new 
chart combining the best features of both is 
highly desirable. 


Summary 


The major objective of this study was to 
devise and apply experimental techniques 
through which data could be obtained to 
form the basis on which principles of chart 
construction could be evaluated. Some of 
these principles seem obvious but until experi- 
mental data were available, they remained in 
the realm of conjecture. 

In evaluating the charts, data were ob- 
tained from readability tests, an analysis of 
test items and pilot preferences on a ques- 
tionnaire. The results indicate that the two 
experimental charts designed for navigation 
in high-speed, high-altitude aircraft are su- 
perior to traditional charts in presenting in- 
formation for cross-country missions. On an 
over-all basis, the experimental charts are not 
statistically different from one another. How- 
ever, there are several features on each chart 
which appear to be highly effective in pre- 
senting navigational information to the pilot. 

It seems apparent, therefore, that the ideal 
chart for navigation in high-speed, high-alti- 
tude aircraft should include the desirable fea- 
tures of each chart with further experimenta- 
tion to determine the effectiveness of their 
interaction. 


Received July 7, 1952. 
The present problem is concerned with 
visual acuity at various brightness levels, but 
from a somewhat different point of view from 
that taken in the usual psychophysical experi- 
ment. Classical studies (4, 5) have described 
visual acuity as a function of brightness level. 
In such studies, interest is in mean or typical 
performance. Individual differences are re- 
garded as variability limiting the generality 
of the findings. In the present problem, we 
are interested in assessing the possibility of 
constructing a practical test of night visual 
acuity. In this regard, we are not concerned 
with mean performance, but attention is 
strongly centered on individual differences. 

The Personnel Research Section of the 
Adjutant General’s Office has been under- 
taking, for some time, the development of a 
practical, predictive test of night visual per- 
formance. In early studies carried out at 
Fort Sill (3) in 1944, and at Camp Blanding 
(10), the Army Night Vision Tester (ANVT- 
R2X) was constructed and validated. The 
instrument is satisfactory from the point of 
view of reliability and validity, but it has 
shown itself too cumbersome for general field 
service. 

The present approach of the Personnel Re- 
search Section to night vision testing attempts 
to substitute an acuity test given at mesopic 
(moonlight) brightness (6.75 log micromi- 
crolamberts) for the scotopic (starlight) test. 
This substitution is desirable because tests of 
mesopic acuity involve less adaptation time 
(and hence more rapid testing), less depend- 
ence on light-tight testing conditions, and 
fewer testing personnel. In the practical mili- 
tary situation, these factors might well be 
critical in determining whether or not a test 
of night vision could be adopted for extensive 
use. 

* The opinions presented in the paper are those of 
the authors and do not necessarily reflect the views 
of the Department of the Army. 


A mesopic acuity test may be substituted 
for a scotopic test, if the relationship between 
the two tests is shown to be high. Studies 
reported in the experimental literature indi- 
cate that visual acuity scores are correlated 
at certain brightness levels. The closer the 
brightness levels tested, the higher has been 
the correlation reported. 

The relationship of acuity at photopic (day- 
light) and scotopic brightnesses has been in- 
vestigated in two studies. Uhlaner and 
Woods, 1951 (7), employing 200 subjects, re- 
ported correlations ranging from .19 to .39 
between various photopic acuity tests given 
at 10.02 log μμL. and scores on the Army 
Night Vision Tester given in the brightness 
range of 3.51 to 5.26 log μμL. Warden, 1944 
(8), however, found biserial correlations of 
only .02 between scores on the Navy Radium 
Plaque Adaptometer at 3.94 log μμL. and 
scores on a Snellen test given at standard pho- 
topic brightnesses. The restriction of range 
on the photopic variable may partially ex- 
plain the low correlation attained in this 
study. The 100 subjects tested all had pho- 
topic acuities of 20/20 or better. 
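
The effect of restricting the range of one variable on a correlation can be illustrated by simulation. The Python sketch below does not use Warden's data: the underlying correlation of .40, the cutoff, and the sample size are hypothetical values chosen only to show the direction of the effect when subjects are kept on the basis of a high score on one variable.

```python
import math
import random


def pearson(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(rho=0.4, n=20000, cutoff=0.5, seed=0):
    """Return (r in the full sample, r among cases with x above cutoff)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        # y shares a correlation of rho with x in the unrestricted population.
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        pairs.append((x, y))
    full = pearson(*zip(*pairs))
    kept = [(x, y) for x, y in pairs if x > cutoff]  # restricted range on x
    restricted = pearson(*zip(*kept))
    return full, restricted


full_r, restricted_r = simulate()
print(f"full range r = {full_r:.2f}, restricted r = {restricted_r:.2f}")
```

With these hypothetical settings the correlation in the restricted group falls well below that of the full sample, which is the kind of attenuation suggested above for a group whose members all had acuities of 20/20 or better.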

Two other studies have been concerned with 
the relationship of acuity scores taken at ad- 
jacent brightness levels in the photopic-meso- 
pic range. L. S. Rowland (6) compared acui- 
ties at 10 log μμL., 7.6 log μμL. and 6.5 log 
μμL. brightness levels, employing 56 subjects. 
The tetrachoric correlations between acuities 
at these levels (computed by the present au- 
thors) are: 10 vs. 7.6 log μμL., r = .61; 10 vs. 
6.5 log μμL., r = .73; 7.6 vs. 6.5 log μμL., r 
= .61. Feinberg and Wirt (2) found the in- 
tercorrelations of scores of far visual acuity, 
measured for 100 subjects on the Bausch and 
Lomb Ortho-Rater checkerboard target, at 
brightnesses ranging from “normal” to 1/33 
of “normal” to range from .71 to .90. Gener- 
ally, the closer the levels compared, the higher 
the correlation attained. 

The present problem extends these analyses 
to a comparison between scotopic visual acuity 
and acuity at photopic and mesopic brightness 
levels. From the viewpoint of assessing the 
practicality of developing mesopic tests to 
measure scotopic acuity, the present study is 
crucial. The feasibility of this approach 
would be demonstrated if indications of suf- 
ficiently high correlations could be shown be- 
tween scotopic and mesopic acuity, and if 
these correlations were substantially higher 
than those between scotopic and photopic 
acuity. 

Method 

Apparatus. The scotopic measurements were 
made on the Army Night Vision Tester 
(ANVT-R2X, 7). This instrument presents a black, 
two-degree Landolt Ring against a four-degree white 
background. The intensity of illumination is 
varied through eight steps of decreasing bright- 
ness, by placing filters over the self-luminous 
radium plaque background. Brightness varies in 
these steps between 5.26 and 3.51 log μμL. The 
subject is required to indicate which one of eight 
positions the break in the ring is facing. Eight 
presentations of the stimulus are given in ran- 
dom order at each brightness level. 

The photopic and mesopic acuity measurements 
were made on wall charts and on the Bausch and 
Lomb Ortho-Rater instrument. All tests were 
conducted in the Pentagon Vision laboratory. 
This laboratory was standardized in conformity 
with specifications prescribed by the Armed 
Forces-National Research Council Vision Com- 
mittee. The layout at this laboratory is shown 
in Figures 1 and 2. 

The wall charts employed included the Modi- 
fied Landolt Ring, Army Snellen, Line Resolu- 
tion, and Quadrant Variable Contrast targets 
(Figure 3). Except for the Army Snellen, these 
charts were developed by the Personnel Research 
Section and were utilized in an earlier factor 
analysis study of photopic visual acuity. 

Photopic and mesopic acuity measurements 
were also made by means of the Ortho-Rater in- 
strument. The optical system of this instrument 
presents the test target at an apparent distance 
of eight meters (1). In the present study, only 
the far visual acuity adjustment was employed. 
Control of the voltage input of the Ortho-Rater 
was maintained by means of a variac, a continuously 
variable resistance. The voltage was registered 
on a voltmeter in parallel with the Ortho-Rater 
and was periodically checked for deviations 
from normal. The variac was also employed to 
obtain the desired levels of photopic and mesopic 
brightness at which testing took place. A slight 
reddening of hue was found at the lower levels 
of illumination; this may have affected the 
subjects’ responses. It is assumed that the effect 
on responses of this change in hue was negligible. 

The Johns Hopkins acuity plates developed by 
Dr. Louise Sloan were employed as test targets 
in the Ortho-Rater. These targets consist of 15 
lines of block letters. The letters in each row 
are of equal size, equal width of stroke, and 
equal spaces between letters. The size of the 
letters and the width of stroke decrease for suc- 
ceeding rows from the top to the bottom of the 
chart. There are five letters in the first row and 
ten letters in the remaining rows. The same 
letters, arranged in different order for each row, 
are used. The letters range in size from 20/200 
Snellen to 20/13 Snellen. 

The brightness levels of the Ortho-Rater were 
calibrated by use of the Macbeth Illuminometer 
and the Taylor Low Brightness Illuminometer. 
In using the Macbeth Illuminometer, the instrument 
was sighted directly at a blank glass plate 
set in the target position of the Ortho-Rater. 
The variac was adjusted until the plate equalled 
in brightness the pre-set level of the Illuminometer. 
In making this adjustment, the required 
brightness was corrected to compensate for loss 
of light at the eyepiece of the Ortho-Rater. The 
Taylor Low Brightness Illuminometer was sighted 
at the blank glass plate through the eyepiece of 
the Ortho-Rater. In all cases the variac settings 
were independently checked by several observers. 

Layout of the Pentagon Vision Laboratory, front view 

Subjects. A total of 19 staff members of 
the Personnel Research Section were previously 
tested in December 1949 on the Army Night 
Vision Tester and were retested for this study in 
December 1950. Sixteen subjects from this group 
were used for one part of the analysis and 15 
were used for another part of the analysis. The 
subjects were not identical for all the tests. 
Subjects were selected to sample a wide range of 
scotopic acuity. 

Procedure. Testing on (a) wall charts, (b) 
monocular Ortho-Rater plates, (c) first binocular 
plates, and (d) second binocular plates occurred 
in separate sessions. A month intervened be- 
tween (a) and (b), a week between (b) and (c) 
and a month separated (c) and (d). Scores on 
the Army Night Vision Tester had been obtained 
about a year prior to the commencement of the 
present study. All testing was conducted with 
corrected vision on those subjects who customarily 
wore glasses. The procedures for testing with 
the wall charts and the Ortho-Rater plates are 
described separately. 

Wall Charts. Each subject was dark adapted 
for 10 minutes in the testing room which was 
darkened to approximately .001 foot-lamberts 
brightness. This length of dark adaptation is 
sufficient to allow valid visual acuity testing to 
be carried out at the lowest brightness level uti- 
lized (Level 8). The tests were observed bin- 
ocularly in the following order: Modified Landolt 
Ring, Army Snellen, Line Resolution, and Quad- 
rant Variable Contrast. Testing continued on 
each subject until he had made three consecutive 
errors. After the scores were recorded, the light 
level was adjusted to the next higher brightness 
level (Level 7); the subject was given an adapta- 
tion period ranging from 15 to 30 seconds, and 
testing again took place in the same order as in 
Level 8. This procedure was followed for the 
remaining six levels. The eight levels of illumination 
employed are shown in foot-lamberts and 
log μμL. in Table 1. 
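
The two brightness units used in Table 1 can be related by a short calculation. The sketch below assumes that the garbled unit in the text is log micromicrolamberts (log μμL.), and it uses the standard definitions of the lambert and the foot-lambert; the function name is illustrative and is not taken from the study.

```python
import math

# 1 lambert = 1/pi candela per cm^2 and 1 foot-lambert = 1/pi candela per ft^2.
# One square foot is 929.0304 cm^2, so 1 ft-L = 1/929.0304 lambert.
LAMBERTS_PER_FOOT_LAMBERT = 1.0 / 929.0304
MICROMICROLAMBERTS_PER_LAMBERT = 1e12  # assumed reading of the unit "uuL"


def foot_lamberts_to_log_uul(foot_lamberts: float) -> float:
    """Return log10 of a brightness expressed in micromicrolamberts."""
    uul = foot_lamberts * LAMBERTS_PER_FOOT_LAMBERT * MICROMICROLAMBERTS_PER_LAMBERT
    return math.log10(uul)


if __name__ == "__main__":
    for ft_l in (3.0, 0.850, 0.003, 0.001):
        print(f"{ft_l:>6} ft-L = {foot_lamberts_to_log_uul(ft_l):.2f} log uuL")
```

Under these assumptions the conversion gives 9.51, 8.96, 6.51, and 6.03 for 3.0, .850, .003, and .001 foot-lamberts, which agrees with the corresponding log values printed in Table 1.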

Total time for each subject in each session was 
approximately 15 minutes. 
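
The stopping rule described above (testing continued until three consecutive errors were made, and the score was the number correct) can be sketched as follows. The response sequence is hypothetical, and the assumption that correct responses are counted only up to the stopping point is an interpretation of the text, not a documented scoring key.

```python
def score_to_three_errors(responses, max_consecutive_errors=3):
    """Count correct responses until the given run of consecutive errors occurs.

    `responses` is an iterable of booleans (True = correct), in presentation order.
    """
    correct = 0
    error_run = 0
    for is_correct in responses:
        if is_correct:
            correct += 1
            error_run = 0  # any correct response resets the run of errors
        else:
            error_run += 1
            if error_run == max_consecutive_errors:
                break  # testing stops at the third consecutive error
    return correct


if __name__ == "__main__":
    # Hypothetical record: two correct, one miss, one correct, then three misses.
    record = [True, True, False, True, False, False, False, True]
    print(score_to_three_errors(record))
```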

Ortho-Rater Tests. The procedure employed 
in administering the Ortho-Rater tests was simi- 
lar to that employed with the wall charts, except 
that only a single type of test target (letters) 
was administered at each brightness level. 

In the monocular testing, the subject was first 
tested with the right-eye target at the lowest 
level of illumination and then with the left-eye 
target at the same level of illumination. As in 
the wall-chart procedure, the level of illumination 
was raised to the next higher level and the sub- 
ject was given 15 to 30 seconds to adapt to the 
higher level. The subject was tested at this 
level with the right-eye target and then with the 
left-eye target. This procedure was repeated in 
the same manner for the remaining six levels of 
illumination. 

Testing at light levels 1, 2, and 8 was omitted 
from the first binocular test. A preliminary 
analysis of the monocular data indicated that the 
targets available did not adequately discriminate 
between subjects at these levels. 

All illumination levels were included in the 
second binocular test because a new target was 
employed. Since this target was new, its suitability 
at each level could not be inferred from the 
monocular data, as had been done for the first 
binocular test. 


Results 


Wall Chart. The relationship between 
scores on the scotopic test and on each of 
the wall chart tests at the brightness levels 
tested is given in Table 2. The Quadrant 
Variable Contrast test was discarded as it 
failed to differentiate between subjects. The 
items of this test were too difficult for the best 
subjects. Scores on the Army Night Vision 
Tester were number of correct responses in 
the 64 presentations constituting the test. 
Scores on the wall charts were number correct 
to three consecutive errors. Rank order cor- 
relations are shown in Table 2. 
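
For readers who wish to reproduce a rank order correlation of the kind reported in Table 2, a minimal sketch follows. It computes the product-moment correlation of the ranks, giving tied scores their average rank; whether the authors handled ties in this way is not stated, and the scores shown are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_order_correlation(x, y):
    """Spearman rank order correlation, computed as Pearson r on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are tied")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    scotopic = [41, 35, 52, 47, 60]   # hypothetical night vision scores
    mesopic = [12, 14, 19, 17, 22]    # hypothetical wall chart scores
    print(round(rank_order_correlation(scotopic, mesopic), 2))
```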

The smoothed scores were obtained by fitting, 
through each individual's scores at the 
various brightness levels, a curve similar in 
shape to the function which seemed to represent 
the relationship between acuity and 
brightness based upon the observations for 
15 subjects. These smoothed scores represent 
an attempt to get scores in which the error 
variance is minimized. Similar logic is implied 
in all methods of curve fitting. 

Ortho-Rater. The relationship between 
scores on the scotopic test and the Ortho-Rater 
scores is given in Table 3. Scores 
on the Ortho-Rater are based on the number 
of rights to three consecutive errors. Best 
Eye “A” is defined as scores on the eye which 
gave best acuity on the majority of brightness 
levels tested. Best Eye “B” is defined as 
scores on the eye which gave best acuity at 
each level. For the first binocular target, 
testing was carried out only for Levels 3 
through 7. 


Table 1 

Levels of Illumination Employed 

                    Ortho-Rater Tests 
Level of 
Illumination    Ft.-Lamberts    Log μμL. 
     1             13.0          10.18 
     2              3.0           9.51 
     3              .850          8.96 
     4              .070          7.85 
     5              .02           7.29 
     6              .009          6.96 
     7              .003          6.51 
     8              .001          6.03 

[The Ft.-Lamberts and Log μμL. columns for the Wall Chart Tests are not legible in the source.] 


Table 2 

Correlations of the Army Night Vision Tester with Raw and Smoothed Wall Chart Test Scores 

N = 15 

Level of         Mod. Landolt        Army Snellen       Line Resolution 
Illumination    Raw   Smoothed      Raw   Smoothed      Raw   Smoothed 
     1          .58     .69         .42     .37          ..     .44 
     2          .61     ..          .44     .41          ..     .47 
     3          .59     .65         .47     .41          ..     .62 
     4          .69     .62         .56     .58          ..     .60 
     5          ..      ..          ..      .57          ..     .69 
     6          .62     .87         .61     .51          ..     .62 
     7          ..      .82         .35     .63          ..     .58* 
     8          .57*    .62*        .63*    .63*         .04*  -.01* 

[.. = value not legible in the source.] 

* The relationships implied by these correlations must be accepted with reservation, as the mesopic tests upon 
which they are based showed inadequate differentiation and variance at these levels. Correlations of .51 are 
significant at the 5 per cent level. 


Discussion 

Relationship of Scotopic and Higher Brightness 
Scores. A trend is found for higher correlations 
to occur between the wall charts and 
the Army Night Vision Tester at the lower 
brightness levels (Table 2). Highest correlation 
(raw) with the Army Night Vision 
Tester occurred at Level 7, 6.51 log μμL., for 
the Modified Landolt Ring, at Level 6, 6.94 
log μμL., for the Army Snellen, and at Level 
5, 7.33 log μμL., for the Line Resolution test. 
The correlations of the Ortho-Rater with the 
Army Night Vision Tester show an increase 
as brightness levels are increased to Level 4 
(7.85 log μμL.) and a decrease at higher 
brightness levels. Highest correlations with 
the ANVT-R2X are obtained at this level for 
Best Eye “A” and “B” and for first binocular 
scores. The second binocular scores show 
highest correlation at Level 6. 


Table 3 

Correlations of the Army Night Vision Tester with Ortho-Rater Test Scores 

N = 16 

Level of        Best Eye    Best Eye     First        Second 
Illumination      “A”         “B”       Binocular    Binocular 
     1            ..          ..           -            .. 
     2            ..          ..           -            .. 
     3            ..          ..          .40           .. 
     4            ..          ..          .54           .. 
     5            ..          ..          .40           .22 
     6            ..          ..          .51           .43 
     7            ..          ..          .43           .33 
     8            ..          ..           -            .35 

[.. = value not legible in the source; - = level not tested.] 

* Inadequate differentiation of subjects was shown by the tests at these levels. Correlations of .50 are 
significant at the 5 per cent level. 

It might reasonably have been expected 
that scores on the Army Night Vision Tester 
would correlate most highly with tests ad- 
ministered at the lower brightness levels, i.e., 
Levels 7 and 8. Failure to obtain this result 
here may perhaps be explained by the un- 
suitability of the wall charts and Ortho-Rater 
targets employed for testing at the lower 
brightness levels. Correlations appear to in- 
crease up to the point where these targets can- 
not be seen by the subjects. 

The wall charts correlate more highly with 
the Army Night Vision Tester than do the 
Ortho-Rater plates (Tables 2 and 3). This 
result may be attributed to the superior acuity 
distributions at the low brightness levels obtained 
on the charts. It should be noted that 
one of the wall charts used the Landolt broken 
ring design, which is similar to the target used 
in the Army Night Vision Tester. The 
specificity of the Landolt target may have 
increased the correlations. Further study 
should be made to determine whether or not 
the ring gives high correlations with scotopic 
tests of other designs. 

These results would raise doubt concerning 
the allegation that scotopic visual acuity 
scores are too unstable to permit their long- 
term prediction. In the present study, the 
Army Night Vision Tester was administered 
to the subjects a full year before the photopic 
tests. Despite this time difference, correla- 
tions of .60 and higher are found between the 
Army Night Vision Tester and the mesopic 
tests. 


Summary 


The aim of this study was to determine the 
correlations among scores on a scotopic visual 
acuity test and scores on wall charts and 
Ortho-Rater plates administered at various 
photopic and mesopic brightness levels. Nineteen 
subjects were employed, selected to show 
a wide range of scotopic acuity scores. The 
correlations obtained are considered only as 
indications of relationships, owing to the small 
number of subjects employed. Scotopic acuity 
was measured with the Army Night Vision 
Tester (ANVT-R2X). Brightnesses ranged 
from 3.51 to 5.26 log μμL. Mesopic and 
photopic acuities were measured with various 
wall chart tests and targets used in a modified 
Ortho-Rater instrument. Brightness 
levels ranged from 6.03 to 10.60 log μμL. 
The main findings of this study are as follows: 

1. Scotopic acuity scores showed moderate 
positive correlations with the photopic acuity 
scores, and higher correlations with mesopic 
acuity scores, both for the wall chart tests and 
the Ortho-Rater plates. 

2. The Landolt Ring acuity target shows 
higher correlations with the Army Night Vi- 
sion Tester than do the other targets. It is 
not possible to state whether this result is 
due to similarity of design of the Landolt 
Ring and the Night Vision Tester, or to some 
intrinsic factor of the design itself. In fu- 
ture developmental work on a test of night 
vision ability, this target should be included 
as one of the mesopic targets. 

3. High correlations with mesopic acuity 
were obtained in the present study, even 
though the scotopic test was administered to 
the subjects a full year before administration 
of the photopic and mesopic tests. This find- 
ing should raise doubts concerning the claim 
that scotopic visual acuity scores are too un- 
stable to permit their long-term prediction. 

4. As a consequence of 1 and 3 above, the 
practicability of developing a mesopic test of 
night vision ability is indicated. Such a test 
would have the following advantages over a 
scotopic test: shorter adaptation time (hence 
more rapid testing), less expensive and cum- 
bersome equipment, less dependence on light- 
tight testing conditions, and fewer testing 
personnel. 

The pilot of modern high-speed aircraft is 
faced with many stress situations that were 
unknown to his predecessors, such as the ex- 
treme radial accelerative forces developed 
when an airplane is maneuvered through a 
change in direction, as in turns and pull-outs 
from dives. Popular literature is rich with 
stories of pilots blacking out (a temporary 
loss of vision due to decreased blood supply 
to the eye), suffering sudden displacements 
of the lower intestines, bleeding at the mouth 
and ears, etc. Other than the occurrence of 
blackout and unconsciousness, the latter con- 
comitants of these forces apparently occur 
rarely, if ever, in practice (1). 

Physiologists and medical research spe- 
cialists, together with engineers, have de- 
veloped protective clothing called g-suits in 
an effort to counteract these radial forces. 
While these efforts have been successful in 
elevating the tolerance threshold somewhat, 
techniques have not been developed to com- 
pensate for the tremendous increased effec- 
tive weight of the body under these increased 
accelerative conditions. A person exposed to 
a 5 g accelerative force, by definition, has an 
effective weight equal to five times his normal 
weight. Woods et al. (2) at the Mayo Clinic 
centrifuge have shown that it is impossible 
for a man to rise from his seat under condi- 
tions of 5 g. In addition to the problem of 
general body movement, the increased weight 
also introduces problems in moving the extremities, 
as the arms and legs weigh equivalently 
more. This introduces serious problems 
for the pilot when he attempts to reach 
for and/or manipulate controls under conditions 
of radial acceleration. 


1 The research reported in this paper was conducted 
at the University of Southern California under 
the auspices of ONR contract N6-ori-77 Task 
Order III and constituted the doctoral dissertation 
of the senior author. Dr. Neil D. Warren supervised 
the research and his kind help and counsel are 
gratefully acknowledged. 


R. C. Wilson 
University of Southern California 

While the effect of these radial accelera- 
tive forces on effective body weight is the 
same irrespective of the direction from which 
the force is imposed, markedly different physi- 
ological effects are associated with them. 
When the force is applied along the vertical 
axis of the body from head to seat, blood 
tends to pool in the abdominal cistern and 
the lower extremities. This is the commonly 
experienced positive g and is the type studied 
in this paper. When the direction of force 
application is reversed, blood pools in the 
head, and this is called negative g. When the 
force is applied at right angles to the vertical 
axis of the body it is called transverse g, and 
has generally less serious effects. The toler- 
ance to transverse g is very high (partially 
accounting for experiments on the prone posi- 
tion for high-speed aircraft pilots), next high- 
est for positive g, and low for negative g. 

This research was conducted for the pur- 
pose of evaluating the effects of positive g 
forces on the speed and accuracy of ballistic 
reaching movements of the arm. All of the 
research data were collected on 48 volunteer, 
but paid, Ss on the human centrifuge located 
on the University of Southern California cam- 
pus. Each S had passed a rigorous physical 
examination before being allowed to partici- 
pate in the study. 


Experimental Procedure 


The 48 Ss were randomly divided into four 
groups of 12 each. Each group was subjected 
to three different g conditions: 1 g, 3 g, and 5 g. 
All made a ballistic reaching movement with their 
hand to a target approximately 5" square at a 
distance of 19" from the starting point. The 
starting point was a switch on the end of a metal 
tube which projected toward them at shoulder 
height and in the midline of the body. From this 
point they reached at an angle of 35° to the target 
in each of four positions: up, down, left, and right. 
The target face was at right angles to the path of 
movement for all four target positions. The 
whole target area was well within the maximum 
working area of the arm as described by Barnes 
(3). 

The switch at the starting point closed a cir- 
cuit on a standard timer when S removed his 
hand. Another micro-switch was placed behind 
a rubber diaphragm which served as the backing 
of the target. When S hit the target surface 
with his finger, this switch automatically opened 
the circuit and stopped the timer. Another clock 
in the circuit started when the starting buzzer 
sounded and stopped when he hit the target. It 
was thus possible to derive the following three 
time scores: the time taken to start the move- 
ment, called reaction time in this study, the time 
taken to make the movement, and the total time 
which elapsed from the sounding of the buzzer 
to the completion of the movement. 
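
The derivation of the three time scores from the two clock readings can be sketched as follows. The sketch assumes, as the description implies, that the timer started by the release of the starting switch gives movement time directly, that the second clock gives total time, and that reaction time is their difference; the class and function names and the sample readings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrialTimes:
    reaction_time: float   # buzzer to release of the starting switch
    movement_time: float   # release of the starting switch to target strike
    total_time: float      # buzzer to target strike


def derive_time_scores(total_clock: float, movement_clock: float) -> TrialTimes:
    """Derive the three time scores (seconds) from the two clock readings."""
    if movement_clock < 0 or total_clock < 0:
        raise ValueError("clock readings must be non-negative")
    if movement_clock > total_clock:
        raise ValueError("movement time cannot exceed total time")
    return TrialTimes(
        reaction_time=total_clock - movement_clock,
        movement_time=movement_clock,
        total_time=total_clock,
    )


if __name__ == "__main__":
    # Hypothetical readings for one reach.
    print(derive_time_scores(total_clock=0.55, movement_clock=0.34))
```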

The face of the target was covered with a 
sheet of polar coordinate graph paper scribed in 
intervals of one-tenth inch. The S’s preferred finger 
was covered with a metal cot that terminated in 
a pin-like point. As the target was struck this 
point punctured the polar coordinate target sheet. 
These punctures indicated the exact location of the 
strikes. The strikes on the target were considered 
from two standpoints: the quadrant of the 
target in which they fell, regardless of the size of 
target center disparity; and the distance from 
the center of the target, direction disregarded. 
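
The two ways of scoring a strike can be illustrated with a short sketch. It assumes that each puncture is read off as horizontal and vertical offsets from the target center in tenths of an inch, with positive values to the right and upward; this coordinate convention and the function name are illustrative only.

```python
import math


def classify_strike(x: float, y: float) -> dict:
    """Score one strike by target half (direction) and by distance from center.

    x, y are offsets from the target center in tenths of an inch
    (assumed convention: +x is right, +y is up).
    """
    vertical = "upper" if y > 0 else "lower" if y < 0 else "on axis"
    horizontal = "right" if x > 0 else "left" if x < 0 else "on axis"
    return {
        "vertical": vertical,
        "horizontal": horizontal,
        # Distance from the center, direction disregarded.
        "circular_error": math.hypot(x, y),
    }


if __name__ == "__main__":
    # Hypothetical strike 3 tenths to the right of and 4 tenths below the center.
    print(classify_strike(3.0, -4.0))
```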

A number of different types of scores were 
available for comparing S’s performance at the 
different g-levels and target positions such as re- 
action time, movement time, total time, direction 
of error, magnitude of error, and the relation be- 
tween the times and the accuracy of the move- 
ments. 

Before starting the test trials, each S was given 
two indoctrination rides on the centrifuge, including 
a ride at 5 g. If S desired to continue, his 
experimental trials were begun. On the first 
experimental day, each S spent about fifteen minutes 
making movements to the target in the position 
he would encounter on that day. Each S 
was also trained to detect the difference between 
the warning and the reaction buzzers (differing 
in pitch) and was shown how his responses would 
be evaluated in the experiment. All Ss were 
instructed to make the movement as quickly and 
accurately as possible, and to strike as near the 
center of the target as they could. During this 
first day’s practice, care was taken to assure that 
S made a ballistic movement (4), and not a moving 
fixation. 

Each of the four sub-groups of Ss, 12 Ss in 
each, had different arrangements of target po- 
sition. Within the framework of the total group, 
all positions preceded and followed each other 
an equal number of times. While the target 
order was the same for all subjects in any one 
group, the order of imposed force for members 
of the group was systematically varied. As a 
result each g level and target position preceded 
and followed each other an equal number of 
times. These precautions were taken to avoid 
any experimental error that might result from 
the serial effects of either g or target position. 
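
One standard way of arranging four conditions so that each precedes and follows every other an equal number of times is a balanced Latin square. The sketch below builds such a square; it is offered only as an illustration of the counterbalancing principle, since the report does not state which arrangement was actually used, and the condition labels are assumptions.

```python
def balanced_latin_square(n: int):
    """Return n sequences of n conditions (0..n-1) in which every ordered
    pair of different conditions is adjacent exactly once (n must be even)."""
    if n < 2 or n % 2:
        raise ValueError("this construction needs an even number of conditions")
    # First row follows the pattern 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Each later row adds 1 (mod n) to every entry of the row before it.
    return [[(c + shift) % n for c in first] for shift in range(n)]


if __name__ == "__main__":
    positions = ["up", "down", "left", "right"]  # one sequence per sub-group
    for row in balanced_latin_square(4):
        print([positions[c] for c in row])
```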

Each S had two experimental days following 
the first day of practice. The target was placed 
in two of the four positions on each of these days, 
and S made four consecutive reaching movements 
for the target at each of the three positive g 
conditions used in the experiment: 1, 3, and 5 g 
(1 g is normal gravitational force, and does not 
involve centrifuge rotation). 

The data of the experiment were 192 movements 
(4 each for 48 Ss) made to each target position 
for each of the three different g levels. 
Only the target position and the radial force 
imposed were known to vary systematically. 


Results 


Direction of Error. The observed distribu- 
tion of the responses in four quadrants of the 
target demonstrated striking changes as the g 
level increased, but the nature of the change 
varied with target position. Figure 1 shows 
the number of responses which fell in each 
of the quadrants for each of the four target 
positions at the three g levels. 

An examination of Figure 1 shows that at 
1 g the movements made upward, to the left 
and to the right tended to fall in the upper 
half of the target, and the responses down- 
ward fell about equally in the upper and 
lower halves. The figure also shows that the 
responses tended to fall on the right side of 
the target when it was in the “up” and “down” 
positions, on the left side when in the “left” 
position, and on the right side when in the 
“right” position. As the g level increases, 
however, the responses moved to the lower 
half of the target when in the “up,” “left,” 
and “right” positions and the upper half 
when the target was in the “down” position. 
Similarly, the responses shifted to the right 
half of the target when it was in the “up,” and 
“left” positions and to the left half when in 
the “down” and “right” positions. 

Table 1 shows the results of Chi Square 
tests of the distribution of responses between 
the g levels for each target position. 
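
A comparison of this kind can be sketched as a Chi Square test on a fourfold table, for example the up-down split of the 192 responses at one g level against the split at another. The sketch uses the uncorrected Pearson formula with one degree of freedom; whether a continuity correction was applied in the study is not stated, and the counts shown are hypothetical. The critical values are those given in the footnote to Table 1.

```python
CRITICAL_5_PERCENT = 3.84  # Chi Square, 1 d.f., 5% level (Table 1 footnote)
CRITICAL_1_PERCENT = 6.64  # Chi Square, 1 d.f., 1% level (Table 1 footnote)


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected Pearson Chi Square for the fourfold table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one response")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


if __name__ == "__main__":
    # Hypothetical counts: upper/lower strikes at 1 g, then at 3 g.
    value = chi_square_2x2(120, 72, 60, 132)
    print(round(value, 2), value > CRITICAL_1_PERCENT)
```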

Of the 24 values, 16 are significant beyond 
the 1% level of significance, and in all but 
six instances they are significant beyond the 
5% level. The responses clearly tend to move 
to the nearer and lower quadrants of the target 
as the g level increases. 

Fig. 1. Frequency of responses in the various 
target quadrants by target position and g level. 

Response Accuracy. In all of the four target 
positions the accuracy of the movement 
was severely impaired by the higher accelerative 
forces. Table 2 shows the circular errors 
for the various g levels and target positions. 

In all cases the magnitude of the error of 
movement was larger (significant beyond the 
1% level of confidence) at 5 g and 3 g than 
it was at 1 g. The increase between 3 g and 
5 g, however, was not significant for either the 
“down” or the “right” position. 

The accuracy of movements to the left was 
significantly poorer at all g levels than those 
made downward and to the right. It was 
also significantly poorer than upward movements 
except at 5 g where no significant difference 
was found between the two. Movements 
to the right, upward, and downward 
were not significantly different in their accuracy 
at the 1 g level, but both movements 
to the right and down were significantly more 
accurate than reaching upward at the increased 
g levels. 

In general, movements to the left were the 
least accurate at all g levels, with movements 
into the other three planes showing no significant 
difference under normal conditions. 
Movements to the right and down, however, 
are a great deal more accurate than upward 
movements at the increased g levels. Reaching 
to the right is somewhat more accurate 
than reaching down at 3 g, but no significant 
difference was found at 1 g or 5 g. 


* Forty-seven of the 48 Ss preferred the right hand 
for making this type of movement. 


Table 1 

Chi Square Values from a Comparison of the Obtained Left-Right and Up-Down Splits in 
Target Strikes at the Various g Levels with Each Other* 

No. of responses = 192 

Target Position: Up (u-d, l-r); Down (u-d, l-r); Left (u-d, l-r); Right (u-d, l-r) 

Up and Down columns (row and column alignment not recoverable from the source): 
80.45, 13.35, 148.45, 1.26, 8.17, 4.06, 6.79, 15.62, 1.70, 51.06, 82.70, 3.00 

Left and Right columns (row and column alignment not recoverable from the source): 
65.33, 3.09, 17.97, 107.54, 29.44, 19.22, 4.95, 13.06, 0.02, 0.18, 42.64, 36.95 

* Chi² of 3.84 significant at the 5% level of confidence; Chi² of 6.64 significant at the 1% level of confidence. 

Movement Time. The time required to 
complete the ballistic movement increased 
markedly as the g level increased for move- 
ments upward and to the left, but was less 
seriously impaired for movements downward 
and to the right. Table 3 shows the move- 
ment times with the target in the four posi- 
tions and for each of the g levels. 

Movement time is significantly longer 
(beyond the 1% level) at each succeeding g 
condition when the target is in the “up” 
position. The time required for the movement 
is similarly, though not as seriously, 
impaired for movements to the left. 


Table 2 

Means, Standard Deviations, and Standard Error of 
the Means of the Circular Error Scores* 

No. of subjects = 48 

                          g Level 
Target          1 g             3 g             5 g 
Position      M    S.D.       M    S.D.       M    S.D. 
Up           4.74  1.73      8.38  3.63      9.67  3.84 
Down         4.61  2.23      5.84  2.88      6.91  3.72 
Left         5.73  2.61      6.97  2.84      9.66  4.57 
Right        4.33  1.90      7.15  2.97      7.79  3.82 

* All values presented in this table are given in tenths 
of inches. Each of the 48 scores from which these 
values were computed represented the average error 
score of four responses made at the g level and target 
position indicated. 








Movements in the downward direction did not 
show any significant increase in time, and 
movements to the right were not consistently 
impaired. 

Both the movements made downward and to 
the right were faster at the increased g levels 
than those made upward and to the left. 
These differences are all significant beyond 
the 1% level of significance except the down- 
left comparison at 3 g which is only significant 
at the 5% level. The only significant differ- 
ence between the speed of movement at 1 g 
in the various directions was that movements 
to the right were faster (at the 5% level) 
than those made upward. No significant dif- 
ferences were found between the speed of 
reaching to the right or downward. 

Reaction Time. Previous research (5) has 
indicated that simple reaction time to both 
sound and light stimuli increases significantly 
at increased g levels. The term reaction time 
as used in this study of reaching movements 
should not be interpreted as a measure of the 
maximum speed of reaction, but should be 
considered as a preparation period prior to 
initiating a movement. This period was 
significantly longer (beyond the 1% level) 
for all target positions at 5 g than it was at 
1 g, but did not differ significantly between 
any of the target positions, or correlate significantly 
with either the accuracy or speed 
of the movement which followed. 


Table 3 

Means, Standard Deviations, and Standard Error of 
the Means of the Movement Times* 

No. of subjects = 48 

                          g Level 
Target          1 g             3 g             5 g 
Position      M    S.D.       M    S.D.       M    S.D. 
Up           1.35  .295      1.50  .484      2.20  .620 
Down         1.31  .337      1.27  .378      1.31  .406 
Left         1.33  .306      1.38  .438      1.59  .505 
Right        1.28  .323      1.23  .359      1.35  .397 

* All values are given in seconds. Each of the 48 
scores from which these values were computed represented 
the total time taken for four response movements 
at the g level and target position indicated. 


Table 4 

Phi Coefficients Between Circular Error and Movement 
Time for the Target Positions by g Level 

No. of subjects = 48 

Target Position     Values legible in the source 
Right               -.416**    -.500** 
Down                -.416**    -.390** 
Left                -.248      -.248 
(unassigned row)    -.374**    -.332*     -.627** 

[The g level row labels and the Up column are not legible in the source.] 

* Significant at the 5% level of confidence. 
** Significant at the 1% level of confidence. 

Relation between Movement Speed and Ac- 
curacy. It was consistently found that longer 
movement times were associated with greater 
accuracy. Table 4 shows the Phi coefficients 
derived from the intercorrelation of each S’s 
movement time and accuracy scores at each g 
level and target position. The significance 
of the Phi coefficients was determined through 
converted Chi Square values. 
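
The conversion referred to here is presumably the usual identity for a fourfold table, Chi Square = N times Phi squared, with one degree of freedom. The sketch below applies it with N = 48 and the critical values quoted in the footnote to Table 1; the function names are illustrative.

```python
def phi_to_chi_square(phi: float, n: int) -> float:
    """Convert a Phi coefficient from a fourfold table of n cases to Chi Square."""
    return n * phi ** 2


def significance_label(chi_square: float) -> str:
    """Classify a 1 d.f. Chi Square against the 5% (3.84) and 1% (6.64) values."""
    if chi_square >= 6.64:
        return "1% level"
    if chi_square >= 3.84:
        return "5% level"
    return "not significant"


if __name__ == "__main__":
    for phi in (-0.416, -0.332, -0.248):
        chi_square = phi_to_chi_square(phi, n=48)
        print(f"phi = {phi:+.3f}  chi square = {chi_square:.2f}  {significance_label(chi_square)}")
```

With 48 subjects, a coefficient of -.416 converts to a Chi Square of about 8.31 (1% level), -.332 to about 5.29 (5% level), and -.248 to about 2.95 (not significant), which matches the asterisks attached to those values in Table 4.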

Inasmuch as movement time and 
amount of error were found to be negatively 
related, it must follow that since both speed 
and accuracy were impaired under increased 
g conditions, the accuracy would be even 
further impaired if the same movement time 
were achieved, and vice versa. 


Interpretation 


Direction of Error. The fact that the 
reaching movements tended to terminate in 
different quadrants of the target as the g 
level increased is attributed to two different 
sources. The first of these might be termed 
“experimental error.” 


Under increased g conditions, the first re- 
sponse of the Ss to the target would quite fre- 
quently fall far below the center of the target. 
If the trial which followed was a 1 g trial, 
the first movement was frequently quite high. 
Both of these types of errors on the first 
movement (too low at the increased g condi- 
tions and too high at the normal g condition) 
were often accompanied by exclamations of 
surprise. The first movement at 1 g was fre- 
quently made in response to the pattern of 
kinesthetic cues that had been used for mov- 
ing the arm under the previous atypical weight 
conditions. This introduced a source of re- 
sponse error that is difficult to judge, but if 
recognized as a systematic source of error, 
admits that the error was due to the condi- 
tions of the experiment and does not detract 
from the meaningfulness of the results inso- 
far as the effect of increased g on movement 
is concerned. 

Second, the accumulation of strikes on the 
lower and nearer sections of the target sug- 
gests that two different types of movement 
errors occurred. First, the observance of re- 
sponses on the near side of the target suggests 
that the initially applied force was insufficient 
to carry the arm to the intended termination 
point, and second, the tendency for the strikes 
to accumulate on the lower half of the target 
is attributed to an error in judging the re- 
quired trajectory of movement. Following 
the terminology of Brown et al. (6), the first 
of these has been called the “negative inertia 
error,” and the second has been termed the 
“error of downward tendency.” As a con- 
sequence, one would expect the strikes to fall 
in the lower half of the target in the “up” 
position as both errors are acting in the same 
direction. In addition, since the types of 
errors are additive in this position, one would 
anticipate movements to this position to be 
the least accurate of all. The results verify 
this deduction. Similarly, the response to the 
target in the left and right positions would be 
expected to accumulate on the lower and near 
section. This is what occurred. With the 
target in the down position, the errors tend 
to offset each other, the negative inertia 
error tending to make them fall on the upper 
half of the target, and the error of downward 
tendency tending to make them fall on the 
lower half of the target. The results have 
shown that the responses fell on the upper 
half at 5 g indicating that the negative inertia 
error was the more predominant of the two. 

Response Accuracy. The increase in the 
error of the movement is attributed to the in- 
adequacy of the normal kinesthetic cues under 
the increased g conditions. Possibilities of 
reduced visual acuity at the 3 and 5 g conditions 
are highly unlikely in view of previous 
research findings on perceptual speed ability 
at these same g levels (7) and the fact that 
no S ever reported gray-out, the preliminary 
symptoms of blackout. 

Movement Time. The increase in move- 
ment time is attributed to the failure of the 
Ss to throw the arm with sufficient force to 
compensate for its increased effective weight. 
Despite the fact that movements made at 
the increased g levels were in the main shorter 
(responses falling on the near side of the 
target) they took longer. The difference in 
time is hardly within that which might include 
a shift from a ballistic to a moving fixation 
movement, and the error pattern reflects no 
such alteration in method of arm movement. 

Reaction Time. The increase in reaction 
time, as defined in this study, is attributed to 
an increased cogitation period before start- 
ing the reaching movement. After the first 
movement at increased g, Ss were immediately 
aware of the fact that this was a different 
situation, calling for a different arm thrust. 
The increase in time between the warning 
buzzer and the start of the movement is con- 
sidered an increase in the readiness period 
taken by the Ss to get better “set” for the 
ensuing movement. 


Summary 


From the results of this research, certain 
conclusions about the effect of increased posi- 
tive radial acceleration on reaching move- 
ments may be advanced. 

1. Both the speed and accuracy of reach- 
ing movements at increased g levels are seri- 
ously impaired, the degree of impairment 
being roughly equivalent to the amount of 
force imposed. 

2. The kinesthetic cues governing the 
thrust of the arm under normal circumstances 
are inadequate to maintain similar accuracy 
or speed under radial accelerative conditions. 

3. Due to the increased weight of the arm 
and the inadequacy of the normal kinesthetic 
cues, two types of errors are found, one being 
the negative inertia error and the other the 
error of downward tendency. 

4. The most favorable location of controls 
for the pilot of high-speed aircraft, both from 
the standpoint of speed and accuracy, is to 
the side of the pilot’s preferred hand and be- 
low its normal resting point. 

5. Emergency controls that might have to 
be manipulated under conditions of increased 
positive radial acceleration should be no 
smaller than two inches in diameter if a pushing 
motion is required. 


Received June 23, 1952. 


Applied Psychology in Action 


Editor’s Note: An announcement of this 
new feature of J. appl. Psychol., including a 
plea for psychologists in business and indus- 
try to send in suitable material was sent to 
about 40 psychologists on the firing line early 
in February, 1953. As of the beginning of 
April, not a single response had been received. 


As noted in the April issue, this new feature 
will be continued only long enough to de- 
termine whether or not our readers desire 
such a section and whether or not psycholo- 
gists will take the time and trouble to submit 
suitable copy. 


Job Supervision of Young Workers 


The following is extracted from a Report 
of the Technical Committee on Supervision 
of Young Workers, Bureau of Labor Stand- 
ards, U. S. Department of Labor, February, 
1953, composed of: Chairman, Mrs. Margaret 
F. Ackroyd, Chief, Division of Women and 
Children, Department of Labor, Providence, 
Rhode Island; Fanny G. Buss, Standard Oil 
Company, Cleveland, Ohio; Mrs. Mary 
Cooper, Hutzler Brothers Restaurant, Balti- 
more, Maryland; Jane F. Culbert, Vocational 
Advisory Service, New York City; Gilbert 
David, The Prudential Insurance Company, 
Newark, New Jersey; James Forster, DeKalb 
Agricultural Association, Inc., DeKalb, Illinois; 
Harry Gladstine, The Washington Post, 
Washington, D. C.; Dr. Dale Harris, Insti- 
tute of Child Welfare, University of Minne- 
sota; Mrs. Bernice Heffner, American Federa- 
tion of Government Employees, Washington, 
D. C.; Kathryn-Lee Keep, Department of 
Labor and Industry, Erie, Pennsylvania; R. 
Bruce Neill, James Monroe High School, 
Fredericksburg, Virginia; Clyde L. Schwy- 
hart, Caterpillar Tractor Company, Peoria, 
Illinois; Thomas E. Walsh, Amalgamated 
Clothing Workers of America, Troy, New 
York; Benjamin C. Willis, Superintendent of 
Schools, Buffalo, New York; and Mrs. Ger- 
trude Folks Zimand, National Child Labor 
Committee, New York City. 

“What Should the Supervisor Know About 
Youth? The Committee believed that the 
core of the task of getting better supervision 
of young workers is to help supervisors of 
youth to be more interested in and better 
understand the basic characteristics of youth 
—their capabilities, their problems, their attitudes, 
and their needs. The responsibilities 
and the attitudes of work supervisors neces- 
sarily center largely about a concept of work- 
ers as adults. Nine-tenths of the Nation’s 
workforce is indeed past 20. Yet almost 
every industry and business has at least a 
small component of young beginners. That 
youth are not yet adult is a truism, but too 
often not fully understood by the man or 
woman whose responsibility it is to help 
youth give good work performance and grow 
up to be good workers. 

In attempting to define the characteristics 
of youth of significance to a work supervisor, 
attention was focused upon youth of about 
14 to 18 years of age. The Committee be- 
lieved, however, that a description of this 
midadolescent group would be useful in un- 
derstanding older youth on the job as well. 
The Committee was exceedingly grateful to 
its member, Dr. Dale Harris, for preparing 
in advance of the meeting an analysis of the 
characteristics of youth in their midadoles- 
cence with special reference to those charac- 
teristics likely to be significant in job situa- 
tions. Bringing the practical job experience 
of various Committee members to bear on Dr. 
Harris’ contribution, the Committee developed 
the following description: 

Youth—A Period of Adjustment. The su- 
pervisor must first of all realize that adoles- 
cent boys and girls are in transition from 
childhood to adulthood, and that this stage 
in a person’s development may be a difficult 
period of personal adjustment. This transi- 
tional stage has no precise age limits, but is 
defined by psychologists as beginning at 
roughly 12 to 14 years and continuing to 21 
or 22 years. The midadolescent period of 
about 14 to 18 years of age is normally the 
period of greatest stress. 

The major areas of adjustment that are the 
primary concerns of teen-agers are usually 
considered to center about four factors: (1) 
how to be attractive to the opposite sex, (2) 
problems of family relations which result 
from their attempts to emancipate themselves 
from parental control, (3) for those still in 
school, anxiety over school achievements, (4) 
concern with vocational plans—though this 
may often be vague and unrealistic. 

Of great significance to those who super- 
vise the early work experience of adolescents 
is the youth’s concern to be considered some- 
body, a person of importance to himself and 
others, with a place in the world and a con- 
tribution to make. 

Of equal significance to supervisors is 
youth’s own insecurity about the emerging 
responsibilities and challenges of adulthood 
and how to act to realize them. They do not 
like to admit these insecurities, but they are 
nevertheless there. Young people therefore 
reach out for security which in large part 
they attempt to find by tying closely to a 
group of their own age. They seek the approval 
of that group, and conformance to its 
standards becomes very important to them. 

In the adolescent’s desire to be grown-up, 
he sometimes has difficulty in accepting the 
authority of adults. This adolescent ‘revolt’ 
expresses itself in various ways—including 
the display of immoderate behavior, language 
or dress. 

Basic Characteristics. There is, of course, 
a tremendous range of differences among in- 
dividuals at adolescence as at any age. No 
two are indeed alike. However, the basic 
characteristics of the adolescent stage of de- 
velopment which the supervisor of young 
workers needs to be aware of, can in general 
be described as follows: 

Physical Maturity. Most girls will be 
physically matured by the age of 14; a good 
many boys are still immature at this age. 
Consequently girls are much more likely to 
appear mature and socially poised than boys 
their own age. 

Strength is closely related to physical maturity. 
Boys gain 50 per cent in actual 
muscle volume during adolescence, girls much 
less. Sexually immature individuals at this 
age will lack considerably in strength and 
endurance. There are great differences in 
strength between physically mature and im- 
mature boys of the same ages. 

Adolescents can mobilize much energy on 
demand, but not all youth are able to main- 
tain sustained output. Furthermore, growth 
in muscle volume may lag behind growth in 
stature; a youth may not be as strong as he 
first appears. 

Many adolescents of this age have yet to 
learn how to achieve a balance between physi- 
cal needs for rest on the one hand, and social 
interest and needs on the other. 

Physical health is good; the period is char- 
acterized by little illness. 

Basic motor skills, such as speed of move- 
ment, reaction time, and coordination are 
fully developed although not necessarily fully 
trained. 

Intellectual Maturity. Intellectual stature 
has just about been reached; measurable in- 
crements of intelligence after fifteen are much 
less significant than those which occur from 
ages ten to fifteen. Many older adults fail 
to recognize that the average adolescent of 
sixteen and seventeen has achieved sharpness 
of intellect and a heightened readiness to 
learn. 

Even though he may be ‘bright,’ the adolescent 
is limited in ‘judgment.’ Though he 
lacks experience, he resents being talked down 
to and being considered unable to solve prob- 
lems. 

The adolescent’s ability to think abstractly 
is well developed, which leads him to seek 
reasons based on principle. This sometimes 
makes him appear argumentative. 

Adolescents are able to evaluate their own 
behavior, and actually they engage in a great 
deal of self-criticism. They are often quite 
sensitive to blame, though they may not seem- 
ingly admit failure when criticized. They 
may be easily discouraged. 

Many adolescents exhibit a great deal of 
intellectual and emotional ‘questing’—a vague 
longing for something unknown. This makes 
them receptive to emotional appeals to loyalty, 
integrity, self-sacrifice. 

Interests and Attitudes. The day-to-day 
behavior of the midadolescent will show con- 
siderable vacillation between the developing 
interests of maturity and the interests of the 
younger child, though the adolescent himself 
frequently will reject the less mature interests 
after returning to them temporarily. Fre- 
quently adults see this as unpredictability and 
unreliability. 

Many adolescents are characterized by con- 
siderable idealism and a sense of altruism, 
and at the same time by snobbishness and a 
feeling of superiority. 

The critical capacities of the adolescent 
may extend to others, so that he appears to 
be highly intolerant. Combined with ideal- 
ism, this characteristic leads him to seek per- 
fection in adults, and he may feel let down 
when they fail to measure up to his expecta- 
tions. These attitudes may carry over into 
his relationships with his work supervisor. 

Adolescents frequently have a strong desire 
to do well, and to get ahead. Although 
they are somewhat vague about specific goals 
in life they want ‘to know where they are 
going.’ 

Social Behavior. Much of the social be- 
havior of adolescents is characterized by ap- 
parent contradictions which upon closer in- 
vestigation are found to be more apparent 
than real. 

There is a strong desire to be treated as in- 
dividuals; there is also a strong desire to con- 
form to the standards set by young people 
their own age. 

On the other hand, there may be much 
deliberate imitation of the attitudes and ac- 
tions of adult associates, especially of those 
they admire. 

There is a strong need to be independent; 
there is also a strong need to be dependent. 

Language is ostentatiously colorful, slangy 
and emotional. Adolescents’ use of profanity 
or obscenities may actually be an attempt to 
appear sophisticated and mature. 

The adolescent is typically group-minded; 
he wants to ‘belong,’ and will respond readily 
to the idea of teamwork.” 


Personnel Psychology in a Steel Company 


The work of Personnel Psychologist George 
M. Hill at the Armco Steel Corporation, Middletown, 
Ohio, was featured in the February 
19, 1953 issue of The Iron Age, pp. 61-62. 
The following interesting chart was included 
in the article: 


Book Reviews 


Kephart, Newell C. The employment inter- 
view in industry. New York, McGraw- 
Hill, 1952. Pp. 277. $4.50. 


The book covers the main items regarding 
content and method of the employment in- 
terview and a lot of other material about em- 
ployment procedures. It is on this point 
that the author might be criticized, namely 
including so much other material in a book 
entitled Employment Interview. It would 
seem from the discussion that the interviewer 
gives tests and visual examinations, diagnoses 
mental maladjustments and interprets all 
available predictors of the criterion. To be 
sure this is all pertinent to the process of 
hiring and the discussion of these aspects is 
sound. Actually there is scarcely enough ma- 
terial on the face-to-face aspect of the em- 
ployment interview to make a respectable 
book and the author presumably did what any 
of us would have done under the circum- 
stances. 

The initial chapter sets the place of the in- 
terview with reference to usual employment 
procedures. Then follow chapters about the 
content of the interview and some “how to” 
aspects. The first of these involves knowl- 
edge of the job with due reference to the Dic- 
tionary of Occupational Titles and various 
blanks devised by the War Manpower Com- 
mission. It is helpful to have some of this 
WMC material in a handy form. The next 
chapter deals with evaluating past experi- 
ence by means of job families, avocations, and 
Volume Four of the Dictionary. Tests of 
intelligence and of motor ability are con- 
sidered. The author cautions against in- 
ferring intelligence from the interview alone 
but does indicate some things such as vocabu- 
lary or sentence structure manifested in the 
interview that might give some indirect evi- 
dence. 

A chapter on personality includes specific 
items of behavior that might be observed dur- 
ing an interview. It also discusses clinical 
symptoms and syndromes. An interesting 
suggestion is attempting to find a job for an 
applicant with serious personality deviations 
who may, nevertheless, adjust to some kind 
of work. This is a commendable acceptance 
of industry’s social responsibility. In con- 
nection with physical demands of the job 
there is due emphasis on vocational possibili- 
ties for persons with physical disabilities. 
Emotional maturity is mentioned as especially 
important for leadership jobs and some spe- 
cific interview questions are suggested which 
might bring out emotional maturity. 

The last two chapters deal more specifically 
with the mechanics of the interview— 
what the reader would anticipate from the 
title of the book. One considers preliminary 
preparation—the actual environment of the 
interview and the application forms. There 
is a tabulation of items involved in a consid- 
erable number of application forms which 
might help someone in devising a form of his 
own. With reference to the actual conduct 
of the interview there is emphasis on the 
avoidance of bias and of stereotyped methods. 
The patterned interview is recommended on 
the basis of some experimental studies of re- 
liability and validity. The over-all conclusion 
appears to be that the interview is needed to 
supplement tests because the latter do not get 
at everything needed in the job and do not 
have perfect validity. The reviewer is dis- 
posed to add that as we perfect objective per- 
sonality tests the importance of the interview 
may ultimately decrease. A final topic is the 
importance of giving the applicant adequate 
information about such things as hazards, 
working conditions and possibilities in the 
job. This aspect might very profitably re- 
ceive more stress in the discussion. 

The book is moderately well documented 
with references at the end of each chapter to 
a few pertinent experimental studies. The 
general level is fairly elementary except for 
an occasional mention of something like mul- 
tiple correlation and the book evidently is 
designed for the person without much tech- 
nical background. There are a lot of wise 
cautions for an interviewer, such as not being 
misled by a good talker. There are also a lot 
of hints as to what kind of questions to ask 
in order to bring out indirectly some aspect 
of personality. Finally, there is a wholesome 
emphasis on the social aspects of the employment 
program and of industry’s responsibility 
for the over-all adjustment of the worker. 


Harold E. Burtt 
Ohio State University 


Wechsler, David. The range of human ca- 
pacities. Second Edition. Baltimore: 
Williams and Wilkins Co., 1952. Pp. 190. 
$4.00. 


This is a revision of Wechsler’s 1935 book 
of the same title. New chapters on produc- 
tive operations and span of life have been 
added and the chapter on the effect of age 
has been enlarged. There has been minor 
rewriting in many parts of the book and it 
has been completely reset. 

The purpose of the book is still “to show 
that human variability, when compared to 
that of other phenomena in nature is ex- 
tremely limited, and that the differences 
which separate human beings from one an- 
other .. . are far smaller than is ordinarily 
supposed.” 

In pursuit of this objective, Wechsler 
gathers data concerning human capacities 
(defined to include such diverse measures as 
temperature, height, reaction time and intelligence) 
and compares the score of the 
lowest person with the highest person within 
the normal population. Normal population 
is arbitrarily defined as excluding one-tenth 
of one per cent of the total population at each 
extreme. The comparison effected is in terms 
of the range ratio which is simply the high- 
est score or value divided by the lowest. 
Wechsler notes that these range ratios are 
small (i.e., less than 5) and asserts that they 
are in the nature of natural constants. A 
hierarchy of range ratios is postulated extending 
from about 1.30:1 in the case of 
linear traits (such as stature and arm length) 
to about 2.5:1 in the case of perceptual and 
intellectual abilities. It is implied that the 
“real” upper limit is probably the growth 
constant e (2.7182) and an attempt is made 
to show that the orderly hierarchy of ratios 
is a function of the number of factors in- 
volved in the various human capacities. 
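
The range ratio as the review describes it can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the per-thousand trimming parameter, and the stature values are hypothetical, and Wechsler's own computational procedure is not given in the review.

```python
def range_ratio(values, trim_per_thousand=1):
    """Highest value divided by lowest after trimming each extreme.

    trim_per_thousand=1 drops one-tenth of one per cent of the cases
    from each tail, following the review's description of the
    "normal population".
    """
    ordered = sorted(values)
    k = len(ordered) * trim_per_thousand // 1000  # cases dropped per tail
    kept = ordered[k:len(ordered) - k]
    return kept[-1] / kept[0]

# Hypothetical statures in inches, not Wechsler's data.
heights = [58 + 0.01 * i for i in range(2000)]
ratio = range_ratio(heights)
```

For these invented statures the ratio comes out a little above 1.3, the order of magnitude the review quotes for linear traits.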

In the new chapter on productive operations, 
data are reviewed on employee productivity 
and it is concluded that a ratio of 
2.00:1 expresses the difference between the 
best and poorest workers. This is interpreted 
as indicating that one cannot expect much 
from selection techniques and need not be 
concerned about uniform pay vs. sliding pay 
scales. 

The chapters on Length of Life, Exceptions, 
and the Burden of Age, while interesting, are 
extraneous to the major theme. For example, 
expected life span gives very large range 
ratios at whatever period of life the ratios 
are computed and Wechsler concludes that 
life span is either not a capacity or that the 
data are badly contaminated. 

In the chapter on Genius and Deficiency 
Wechsler embraces the theory of critical dif- 
ferences to explain both ends of the con- 
tinuum. After a given quantitative change, 
he asserts, qualitative distinctions appear. 
The ability to see new relationships might be 
such a change at the upper end of the scale. 
This qualitative change, it is maintained, ac- 
counts for differences which greatly exceed 
the “mere 50 IQ points” which separate the 
genius and the idiot from the average. In his 
last chapter on the Meaning of Differences, 
however, Wechsler apparently forsakes this 
line of thinking and returns to the refrain that 
if the range ratios yield small numbers, then 
the differences which separate men are incon- 
sequential. 

An appendix on mental measurement and 
one containing his basic data complete the 
book. 

To this reviewer, the book suffers from three 
major confusions. First, it seems obvious 
from the frequent reference to social signifi- 
cance, democracy, rule by the elite, etc. that 
Wechsler feels that to believe in democracy 
one must demonstrate that all people are 
really equal in everything from body tem- 
perature to test scores. This, of course, is 
a confusion of value judgments with descrip- 
tive physical and psychological statements. 
It is perhaps a common confusion but should 
be deplored all the more for that fact. 
Second, while Wechsler knows the rules for 
measurement and the necessary prerequisites 
for making meaningful ratios, he apparently 
does not apply these rules to all of his data 
and reports range ratios for Mental Ages, 
IQ’s and number of items correct in a vocabulary 
test. It is obvious that these ratios have 
no significance since they do not have known, 
meaningful zero points and equal units. 
Third, while Wechsler recognizes that words 
like “small” and “large” are judgmental words 
whose meaning is not clear without a set of 
references, he persists in saying that because 
range ratios can be expressed in “small num- 
bers” (arbitrarily defined), they are “small.” 
And, since they are “small,” the differences 
between people are “small” and this has great 
social significance. In other words, “small” 
is defined in one context and applied in an- 
other where it carries added meanings. 
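
The reviewer's second objection, that ratios mean little without a known zero point, can be illustrated with body temperature, one of the measures named earlier in the review. The sketch below is an assumption-laden example: the two temperatures are invented, and the point is only that the same pair of readings gives a different "range ratio" on each scale.

```python
def fahrenheit_to_celsius(f):
    return (f - 32.0) * 5.0 / 9.0

def celsius_to_kelvin(c):
    return c + 273.15

# Hypothetical low and high body temperatures in degrees Fahrenheit.
low_f, high_f = 97.0, 100.0

low_c, high_c = fahrenheit_to_celsius(low_f), fahrenheit_to_celsius(high_f)
low_k, high_k = celsius_to_kelvin(low_c), celsius_to_kelvin(high_c)

# The same two readings, three different ratios.
ratio_f = high_f / low_f
ratio_c = high_c / low_c
ratio_k = high_k / low_k
```

Only the Kelvin scale has a non-arbitrary zero, so only that ratio has a physical interpretation. The same reasoning applies to Mental Ages, IQ's, and item counts, which have no established zero at all.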

The work is further marred by misinterpre- 
tation of the data and a running series of 
errors. In the chapter on productive opera- 
tions, for example, ten range ratios are intro- 
duced as evidence: 1.73, 2.00, 2.04, 2.10, 2.30, 
2.53, 2.55, 2.57, 2.83, 3.00. From this array 
it is concluded that the range of productivity 
is “. . . not more than 2.5:1 and generally 
more nearly 2.0:1” (!). No evidence is ever 
given for the repeated statement that e is the 
probable upper limit of the ratio. Reference 
is made to figures and tables which disagree 
with the text (e.g., Figure 5, Table 7); a 
significant claim is made about modes in the 
data, one of which is nonexistent, etc. The 
invitation to check the results by recalculat- 
ing the data in the appendix is not reassuring. 
In a cursory examination of these data the 
reviewer found fifteen cases of considerable 
error. Many of these errors also appear else- 
where in the book. Either the data have been 
misprinted, the original range ratios miscal- 
culated, or both. In two cases the data are 
patently impossible. 
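
The reviewer's objection to the productivity conclusion can be checked directly from the ten range ratios quoted above. The short sketch below uses only those ten figures; the variable names are illustrative.

```python
from statistics import mean, median

# The ten range ratios quoted in the review.
ratios = [1.73, 2.00, 2.04, 2.10, 2.30, 2.53, 2.55, 2.57, 2.83, 3.00]

average = mean(ratios)    # about 2.37
middle = median(ratios)   # about 2.42
above_2_5 = [r for r in ratios if r > 2.5]  # ratios exceeding 2.5:1
```

Half of the ten ratios exceed 2.5:1, and both the mean and the median lie nearer 2.4 than 2.0, so the array does not support "not more than 2.5:1 and generally more nearly 2.0:1."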

Wechsler’s basic notion of the range ratio 
offers interesting and intriguing research ideas 
when confined to the kinds of data to which 
it legitimately applies. In the present con- 
text its potential value appears to be buried 
in a host of confusions and irrelevancies. 
There appears to be no more need for the 
second edition of this book than there was 
for the first edition. 

James J. Jenkins 
University of Minnesota 


Division of Occupational Analysis, United 
States Employment Service. Dictionary 
of occupational titles, second edition. 
Washington: United States Government 
Printing Office, 1949. Volume I, Defini- 
tions of Titles, Pp. xxviii + 1518, $4.00. 
Volume II, Occupational Classification and 
Industry Index, Pp. xxvi + 743, $2.50. 


Users of the DOT should welcome the 
Second Edition because it provides them with 
more occupational information in more usable 
form than did the early edition. Volume I 
contains the job definitions including those 
from the old Part I, the various supplements, 
and additional new definitions. The appen- 
dices from the original edition (Glossary; 
Index of Commodities to assist in classifying 
Sales Personnel; Occupational Titles Ar- 
ranged by Industry) have been moved to 
Volume II of the Second Edition. Other 
readily apparent changes are the introduction 
of a double alphabetic scheme of presenting 
definitions and a considerable simplification 
of the reference techniques. The main alpha- 
betic listing presents every job and occupa- 
tional title by straight letter alphabetization. 
This is a desirable change over the former 
word alphabetic listing which was more troublesome 
to users because of the numerous 
compound and multi-word titles. For example, 
in the original edition CELLAR 
WORKER preceded CELLARMAN, while 
in the new edition the order is reversed. 
Within the main listing are indented sub- 
listings of job definitions most closely related 
to the base definition; thus, users are saved 
the time of locating these definitions through- 
out the volume and can more readily compare 
the different definitions. The variety of 
reference phrases found in the first edition is 
now reduced to the words “see” and “see 
under.” Teachers and others providing in- 
struction in DOT usage will join the user in 
approving these changes. 
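
The difference between the two alphabetizing schemes can be sketched with the reviewer's own example. The sort keys below are an assumed reading of "word alphabetic" and "straight letter" ordering, not the DOT's published filing rules.

```python
titles = ["CELLARMAN", "CELLAR WORKER"]

# Word alphabetic listing: compare titles word by word, so the
# one-word prefix CELLAR files ahead of CELLARMAN.
word_by_word = sorted(titles, key=lambda title: title.split())

# Straight letter alphabetization: ignore spaces and compare the
# letters as one continuous string.
letter_by_letter = sorted(titles, key=lambda title: "".join(title.split()))
```

Under the first key CELLAR WORKER precedes CELLARMAN, as in the original edition; under the second the order is reversed, as the review reports for the new edition.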

Changes not immediately apparent have 
also been effected. Coverage has been ex- 
panded within the professional occupations 
as well as within several industrial categories. 
Codes now accompany all job definitions 
previously referred to classification titles. 
Four of the so-called grouping title definitions 
have been eliminated and type-of-work 
classifications introduced for many laboring 
jobs. Such changes, coupled with the double 
alphabetic listings, mean that thousands of 
additional jobs are now readily codable, which 
is a marked contrast to the cumbersome mul- 
tiple reference processes required with the 
first edition. 

Glimpses of other possible improvements 
are found by comparing the Second Edition 
with the former publications for such classifi- 
cations as CHEMICAL ENGINEER, ELEC- 
TRICAL ENGINEER, and CHECKER 
(clerical) III. Redundancy, overlapping, and 
repetitious classifications have been substan- 
tially eliminated with no significant loss of 
occupational information. It is regrettable 
indeed that publication was not delayed until 
a host of similarly needed changes were made 
—also until the remaining grouping title defi- 
nitions, so frustrating to users, were elimi- 
nated. 

A serious complaint against the original 
edition was the amount of training time re- 
quired to achieve proficiency in its use. 
Here, again, the Second Edition is an im- 
provement—experience having already shown 
that training time is about one third less. 

The general format of the Occupational 
Classification in Volume II remains essentially 
the same with the different titles being readily 
identifiable so that users may locate those 
with definitions in Volume I directly. Users 
will be pleased to note the elimination of the 
LABOR, PROCESS classifications. 

This reviewer strongly feels that the kinds 
of improvements found in the Second Edition 
should have been extended throughout many 
additional occupational areas. 

Alan M. Kershner 
Personnel Research Center, Inc., 
Arlington, Va. 


Prasad, Kali. Fatigue and efficiency in textile 
industry. Lucknow, India: Univ. Lucknow 
Press, 1950. Pp. iii + 34. Rs. 1/8 
or 2s. 3d. 


This is one report in a continuing series of 
research studies begun in 1947 in the Swadeshi 
Cotton Mills, India. Four operations 
were studied in detail. Output data were 
analyzed by hours, days, months, and shifts. 
Psychophysical tests were administered to 
some employees, and a questionnaire was ad- 
ministered to a sample. 

It is difficult to evaluate this book properly 
because of cultural differences in the degree 
of psychological sophistication between India 
and the United States. Further, this book is 
not the final report of the entire series of 
studies. 

From the viewpoint of United States psy- 
chology, this study is weak in several im- 
portant respects. Fatigue is defined as “a 
condition . . . caused by activity in which the 
output produced by that activity tends to be 
rather poor, and the degree of fatigue tends 
to vary directly with the poverty of output.” 
It appears to this reviewer that this defini- 
tion also covers “monotony,” for example. 
It might be better to concentrate on varia- 
tions in output, and forget the fatigue. 

The mill had 9,404 employees, but most 
data refer to extremely small samples, i.e., 4, 
8, 33, etc. The “criterial level” or “ideal per- 
formance” of each worker was based on a 
one-half hour sample of his output. These 
samples are not only too small, but also sub-
ject to disturbing variables such as the “Haw-
thorne effect.” It is not always clear whether
data were “experimentally” collected, or taken 
from routine records. The significance of 
much of the data is not clear. 

Since Indian psychology and economy are 
both rather new, it is possible that this is an 
important study in India. The present book 
is not particularly valuable to Americans. 
Perhaps the final report of the whole study 
will be useful. 

Harold F. Rothe 


American Hospital Supply Corporation, 
Chicago, Illinois 


Lauer, A. R. Learning to drive safely. Min- 
neapolis: Burgess Publishing Co., 1949. 
Pp. 145. $2.25. 


This manual conveniently and precisely 
presents a well-conceived driver training
course for the course administrator, the driv- 
ing instructor, and the student. In the words
of the author: “This manual is the result of
twenty years’ study of drivers’ aptitudes, 
habits, abilities, and disabilities, in addition 
to ten years’ experience in teaching drivers 
and instructors of driving at Iowa State Col- 
lege. Every step outlined has been carefully 
tested and evaluated for difficulty, order of 
presentation, and usefulness. . . .” 

Duties and responsibilities of all connected 
with the course from administrator to student 
are specifically prescribed, including such de- 
tails as solicitation and payment of fees and 
principles and procedures to protect students 
and equipment. The course material itself 
consists of ten basic units of instruction, 
which may be covered in ten or more lessons. 
Each unit contains an introduction directed 
to the student, an outline of skills to be mas- 
tered, a few reference readings, a list of ques- 
tions, and a student’s report form. A valu- 
able appendix contains specific suggestions
for handling classes, suggested administra-
tive forms, psychophysical and psychological
tests, a list of equipment needed, and a list 
of films and visual training aids. 

While the manual is directed to a non- 
professional audience, there are two items of 
particular interest to psychologists. First, 
the author considers the development of 
proper attitudes toward driving as a most 
important part of the course. He stresses the 
need for the course to begin with reading 
and classroom discussion and exercises in 
order “to broaden their interest in good driv-
ing and the philosophy of safety education.”
Secondly, the appendix does contain a short 
section on the interpretation and use of psy- 
chophysical and psychological tests. One 
hopes that the users of the tests contained in 
the manual will seek advice from persons 
qualified to interpret test results. The re-
viewer finds a statement encouraging this ac-
tion conspicuously lacking. 

In summation, the manual should be a 
tremendous aid to the school administrator 
planning to establish a driver training course 
or seeking to improve an existing course. Its 
primary value to psychologists, as well as to 
all citizens, will not come from reading the 
manual, but will accrue from the lowering of
the accident rate as more and better formal
training courses are established.

Stanley E. Jacobs


Department of the Army,
Washington, D. C.


Shostrom, Everett, and Brammer, Lawrence. 
The dynamics of the counseling process. 
New York: McGraw-Hill Co., 1952. Pp. 
xvi + 213. $3.50. 


A book designed to meet the present needs 
of counseling should certainly deal with, 
among other things, problems of definition, 
real or apparently conflicting theories and 
practices, foundations in general psychology, 
especially learning and motivation, and the 
role of counseling in its various settings. This 
book was apparently so designed. Unfor- 
tunately, it falls short of the mark at almost 
every point. 

Its core is a description of the “self-adjus-
tive” approach which represents one more at-
tempt to synthesize the Minnesota and Chi- 
cago positions. It is Rogerian in its major 
features but tries to make room for testing 
and informational procedures. It is defined 
as “. . . counseling which assists the client 
to become more self-directive and self-re-
sponsible” (p. 2). While it may be more
appealing to state a definition in terms of 
goals rather than in terms of operations, it 
is probably less scientifically useful. For- 
tunately, this does not affect the discussion of 
actual methods which is fairly well done and 
describes procedures that have been followed 
for some time in the better counseling centers. 

Concerning the synthesis of opposing points 
of view, the authors furnish their own best 
criticism: “It is this middle-of-the-road stand, 
taken by so many counselors, which has 
created more confusion than clarification in 
counseling methodology” (p. 4). For despite 
their diagrams and denials, they are, if not 
on a continuum (which probably does not 
exist), certainly in between. 

In the process, they tilt at the usual wind- 
mill, directive counseling, and bandy about 
the usual tired, emotionally-toned, invidious 
comparisons in which the good (i.e., Rogerian, 
permissive, or self-adjustive methodology) is 








244 


characterized as “democratic” as in the fol- 
lowing: “It would appear that the basic as- 
sumptions of democracy and those of client- 
centered therapy are one and the same. . .” 
(p. 10). Their bête noire is Williamson’s
fifteen-year-old text on counseling. 

The authors’ attention to a systematic 
basis on learning theory for their method is 
commendable, but the reviewer was puzzled 
by their use of John Dewey’s 1933 book as 
the principal (and almost exclusive) core of 
the presentation. The names of Hull, Guth- 
rie, and Tolman do not appear, and regretta- 
bly little is made of the cited works of Dol-
lard and Miller, Mowrer, and Shoben.

The Stanford Guidance Study is presented 
as an example of research on counseling. In 
it, “Feeling tone . . . was the criterion used to 
evaluate the effectiveness of counseling” (p. 
41). Their conclusions are based on differ- 
ences between ratings of such feelings; differ- 
ences are described as significant, but no sta- 
tistics are presented to document this. 

Methodologically, they have accepted 
several doubtfully valid notions. For ex-
ample, they say, “It is assumed that clients
are capable of selecting their own tests” (p.
74). Also, they suggest that, “Perhaps if
counselors would concentrate less on the limi- 
tations of students and more on the limita- 
tions of test data, the quality of guidance 
would improve” (p. 29). Perhaps! But this 
is a rather naive criticism of one of the best 
developed aspects of counseling. And it is 
difficult to see how the authors could have 
failed to see the implications of their state- 
ment that, “The only (italics added) indica- 
tors that anxiety has been reduced are the 
client’s feelings expressed toward himself and 
the counseling services” (p. 151). 

While the foreword makes much of the fact 
that this book conceives of counseling as an 
integral part of education, the actual discus- 
sion of the role of counseling in colleges and 
universities is limited to six pages of quite 
superficial description of needs. The train- 
ing of counselors is dealt with in fourteen 
lines which emphasize the value of electrical 
recordings; these presumably give the trainee 
a knowledge of rather than a knowledge about 
counseling.

This book left the reviewer with one over-
riding impression: that it is undigested. The
authors are obviously enthusiastic and am- 
bitious; they are aware of the major needs of 
counseling; they have written well and made 
their points forcefully. Yet the total prod- 
uct is, regrettably, unsatisfactory. 

John W. Gustad 


University Counseling Center,
University of Maryland 


Guetzkow, H. (Ed.). Groups, leadership and
men. Pittsburgh: Carnegie University
Press, 1951. Pp. ix, 293. $5.00.


This book presents progress reports of five 
years (1945-1950) of contract research in 
Human Relations sponsored by the Office of 
Naval Research. These twenty reports are 
revisions of papers given at a mutual stock- 
taking conference which was held at Dear- 
born, Michigan in September, 1950. Psy- 
chologists predominate among the contribu-
tors, who also include sociologists, political
scientists, economists, and journalists, all of 
whom were members of the research teams 
involved in the undertaking. 

The book is divided into the three main
sections suggested by the title. About half 
of the space and total number of reports are 
included in the first section, which deals with 
research on the behavior of groups. R. B. 
Cattell introduces this section with the formu- 
lation of methodology and basic concepts. 
Following this discussion are a number of re- 
ports by leading members of several Univer- 
sity of Michigan research centers. Among 
the subjects treated are components of group 
morale, the effects of communication on non-
conformists, workers’ loyalties to union and
management, and factors making for group 
productivity. The section is concluded with 
Margaret Mead’s paper on research in con-
temporary cultures.

The second section deals with problems of 
leadership. Topics discussed include: the in- 
fluence of the group in determining leader- 
ship style, the relation of the follower’s per- 
sonality to the leader, and leadership effec- 
tiveness at the production level. 

In the final section, which is concerned with 
individual behavior, the psychological reader
is brought back to terra firma. New light is
cast on traditional problems of measuring 
motivation, the relationship of verbal be- 
havior to the reasoning process, and the ad- 
vantages of neuropsychiatric screening. 

Some technical detail has been omitted 
from the original reports in order to make 
them suitable for a wider readership. How- 
ever, references at the end of most chapters 
have been supplied to aid the more curious 
social scientists in following up these over-
views.

A service is rendered the reader by John G.
Darley who has contributed introductory and 
concluding chapters designed to give perspec- 
tive and integration to an assortment of com- 
petent but somewhat discontinuous reports 
of on-going research. The reader who is more 
concerned with practical military application 
is accommodated by a discussion at the end 
of the book, and those interested in securing 
contract subsidy for their projects will find 
the appendix helpful. 

In general, the content of the book is more 
of a prologue to a new social psychology than 
a report of substantial achievement. The
atmosphere is one of more problems raised
than solved, and the predominant theme is
“further research needed.” But there is a
healthy respect for the canons of scientific 
method by the seasoned researchers who have 
contributed to this volume. Although the 
reports deal with path-finding in new terri- 
tories, the projects generally involve prob- 
lems that are reduced to testable hypotheses. 
The generalizations are for the most part 
tentative and limited to the data actually in- 
volved, with a notable absence of intuitive and 
sweeping conclusions. Nevertheless, the re- 
viewer concurs with Darley in the expressed 
need for more synthesis and higher order gen- 
eralization, since the visions gained by this 
exploratory work may tend to be obscured 
by the trees. Another source of uneasiness 
also made explicit by Darley is the insuf- 
ficient consideration of the role of abilities 
and interests as determinants of group be- 
havior. Since psychology has had consid- 
erable success in these areas, even a prologue 
may benefit from the past. 

This volume is a useful source of supple-
mentary reading for students in social psy-
chology, and is of particular interest to the 
multitude of social scientists now in the em- 
ploy of various Armed Forces Human Re- 
sources programs. It has a wider appeal than 
most technical publications of in-service mili- 
tary groups, for the emphasis is on basic re- 
search which the Navy has so far-sightedly 
underwritten. Also, it is a good example of 
what may emerge from the large-scale insti- 
tutional research which has become another 
sign of our times.

Abraham S. Levine


Bureau of Naval Personnel,
Washington, D. C.


Curran, C. A. Counseling in Catholic life
and education. New York: The Macmillan 
Co., 1952. Pp. 462. $4.50. 


This book is a new approach to counseling 
in a number of ways. It is new in combining 
an accurate knowledge of modern counseling 
techniques as they have developed in the 
fields of psychology and education in America 
with the Thomistic and Aristotelian concepts 
of the virtues. In addition, it definitely re-
lates religion to counseling.

In its technical presentation, this book 
clearly distinguishes counseling from guid- 
ance and so opens the way for the use of both 
types of relationships with persons who come 
for help. Curran defines counseling as “a 
definite relationship where, through the coun-
selor’s sensitive understanding and skillful
responses, a person objectively surveys the 
past and present factors which enter into his 
personal confusions and conflicts and, at the 
same time, reorganizes his emotional reac- 
tions so that he not only chooses better ways 
to reach his reasonable goals, but has suf- 
ficient confidence, courage, and moderation to 
act on these choices.” Elsewhere he has 
defined guidance as “a relationship in which 
a person equipped in a particular field sup- 
plies pertinent facts to an immediate personal 
need. Guidance readiness occurs,” he says, 
“when a unique convergence of events in a 
person’s personal life makes a particular kind 
of information far more meaningful at a given 
point in life than it would be at any other 
period.” In this book, the discussion of coun-
seling presupposes adequate instruction and
information obtained from teaching or guid- 
ance and treats these questions only to ex- 
plain more clearly some aspect of counseling 
or reasons for a particular kind of counselor 
method. 

The book is made up of five parts. The 
first part includes some important recent de- 
velopments in counseling. The second sec- 
tion delineates the process of personal in- 
tegration as it occurs in counseling from the 
point of view of the person who comes for 
counseling. This is of special interest for 
beginning counselors who may not have an 
experiential grasp of a counseling series as 
the person goes through it. The experienced 
counselor, too, may find, as did the present 
reviewer, numerous considerations not treated 
in other books on counseling. Part III pre-
sents the counselor’s side by unfolding the
skill of the counselor as it varies throughout 
the different phases of counseling. This sec- 
tion gives a detailed exposition of the coun- 
selor’s skill in each of the five stages of coun- 
seling described: establishing the relationship, 
initiating counseling dynamics, later phases, 
and the final stages of counseling. A chapter 
on skills with children is also included. Most 
of this part is given over to the different 
methods of the counselor’s responses so that
the deepest content of a person’s statements may
be objectively unfolded and reflected. The 
detailed excerpts from actual interviews are
exceptionally suitable for a careful study of
the counselor’s skill. Part IV, the approach 
to counseling, is directed to increasing coun- 
selor sensitivity to counseling atmosphere, to 
disguised expressions of counseling need, and 
to the ways in which informational and 
guidance roles may facilitate counseling. It 
includes also a chapter on group discussion 
and group counseling. The concluding chap- 
ter is an integration of counseling with re- 
ligion. This is especially valuable since both 
counseling and religion aim at aiding a per- 
son to be more at peace with God and him- 
self, happier, and more able to lead an in- 
dependent, responsible, achieving life. 

While this book has a definitely Catholic
application, as its title indicates, the title
could be misunderstood and so prove mis-
leading. The content of the book would
readily be shared, in this reviewer’s opinion, 
by chaplains of any denomination as well as 
by psychologists, psychiatrists, and educators 
and, in fact, by any persons who have active 
religious beliefs and convictions and wish to 
see how such a religious point of view can 
be integrated with modern methods of coun- 
seling and guidance. A special merit of the 
book is that it achieves this integration with- 
out losing any of the rigor and exactness of 
a careful scientific study. 

Robert J. Sherry 


Hq. Army Field Forces, 
Fort Monroe, Virginia


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson,
Editor, Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota.


Understanding that boy of yours. 
plegate. Washington, D. C.: Public Affairs Press, 
1953. Pp. 52. 

Rudolf Pintner, in memoriam. Seth Arsenian, Ed. 
Washington, D. C.: Gallaudet College Press, 1953.
Pp. 63. 

Innovation, the basis of cultural change. H. G. 
Barnett. New York: McGraw-Hill Book Co., 
Inc., 1953. Pp. 462. $6.50. 

Getting along with people. Eugene J. Benge. New 
London: Bureau of Business Practice, National 
Foremen’s Institute, Inc., 1952. Pp. 29. $.25. 

Practical psychology. Karl S. Bernhardt. Second 
edition. New York: McGraw-Hill Book Co., Inc., 
1953. Pp. 337. $3.75. 

Psychoanalytic theories of personality. Gerald S. 
Blum. New York: McGraw-Hill Book Co., Inc., 
1953. Pp. 219. $3.75. 

Social factors related to job satisfaction. Research 
Monograph No. 70. Robert P. Bullock. Colum- 
bus: Bureau of Business Research, Ohio State Uni- 
versity, 1952. Pp. 105. $2.00. 

Human relations I. Cases in concrete social science. 
Hugh Cabot and Joseph A. Kahl. Cambridge: 
Harvard University Press, 1953. Pp. 273. $4.25. 

Human relations II. Concepts in concrete social sci-
ence. Hugh Cabot and Joseph A. Kahl. Cam-
bridge: Harvard University Press, 1953. Pp. 333. 
$4.75. 

Phantasy in childhood. Audrey Davidson and Judith 
Fay. New York: Philosophical Library, 1953. 
Pp. 188. $4.75. 

Marriage, morals and sex in America. Sidney Dit- 
zion. New York: Bookman Associates, 1953. Pp. 
440. $4.50. 

Statistics in psychology and education. Henry E. 
Garrett. New York: Longmans, Green and Co., 
Inc., 1953. Pp. 460. $5.00. 

The human senses. Frank A. Geldard. New York: 
John Wiley and Sons, Inc., 1953. Pp. 365. $5.00. 

The intimate life. J. Norval Geldenhuys. New 
York: Philosophical Library, 1952. Pp. 96. $2.75.